Uncategorised

Foundational monographs & books

Moretti, F. (2013). Distant Reading. Verso.
— Seminal short book collecting Moretti’s essays that popularized “distant reading” and network/graph approaches to literary history.
Jockers, M. L. (2013). Macroanalysis: Digital Methods and Literary History. University of Illinois Press.
— Practical, method-focused treatment of large-scale literary computing and macroanalysis.
Ramsay, S. (2011). Reading Machines: Toward an Algorithmic Criticism. University of Illinois Press.
— Reflexive, methodological argument for integrating computational procedures into literary interpretation.
Manovich, L. (2020). Cultural Analytics. MIT Press.
— Presents visual/cultural analytics methods (esp. for image/visual corpora) that strongly overlap with computational comparative work.
Underwood, T. (2019). Distant Horizons: Digital Evidence and Literary Change (if you want interpretive+method syntheses — see his articles and public scholarship for methodology and case studies). (see Underwood articles below)

Influential papers

Aiden, E. L., & Michel, J.-B. (2013). Uncharted: Big Data as a Lens on Human Culture. Riverhead Books.

Algee-Hewitt, M., et al. (2016). Canon/archive: Large-scale dynamics in the literary field.

Baillot, A., et al. (2018). Scholarly digital editions and computational reuse.

Bamman, D., Underwood, T., & Smith, N. A. (2014). A Bayesian mixed effects model of literary character. In Proceedings of ACL 2014.

Bizzoni, Y., & Lappin, S. (2018). Modeling literary characters using distributional semantics.

Bizzoni, Y., et al. (2014). Character interaction modeling in European theatre.

Bode, K. (2012). Reading by Numbers: Recalibrating the Literary Field. Anthem Press.

Bode, K., & Dixon, R. (2009). Resourceful reading.

Borek, L., et al. (2016). Open science and FAIR data in European literary corpora.

Börner, I., et al. (2020). Linked open data and the DraCor API.

Brody, S., & Lapata, M. (2008). Unsupervised aspect modeling for literary texts.

Burnard, L., et al. (2020). ELTeC Encoding Guidelines (Level 0).

Casanova, P. (1999). La République mondiale des lettres. Seuil.

CLS INFRA Consortium. (2021–2025). Computational Literary Studies Infrastructure. European Union Horizon Programme.

COST Action CA16204. (2017–2022). Distant Reading for European Literary History.

Craig, H., & Kinney, A. (Eds.). (2009). Shakespeare, Computers, and the Mystery of Authorship. Cambridge University Press.

Damrosch, D. (2003). What Is World Literature? Princeton University Press.

DARIAH-EU. (2014–). Digital Research Infrastructure for the Arts and Humanities.

Digital Humanities Quarterly. (Various issues).

Digital Scholarship in the Humanities. (Various issues).

Eder, M. (2016). Rolling stylometry. Digital Scholarship in the Humanities, 31(3), 457–469.

Eder, M., Rybicki, J., & Kestemont, M. (2016). Stylometry with R: A package for computational text analysis. The R Journal, 8(1), 107–121.

Edmond, J., et al. (2020). Research infrastructures and digital humanities in Europe.

ELTeC Consortium. (2020–). European Literary Text Collection.

Evert, S., et al. (2017). Understanding topic models in literary research.

Fischer, F., et al. (2017). Le drame comme réseau de relations: L’analyse de réseau appliquée à l’histoire du théâtre. Revue d’historiographie du théâtre.

Fischer, F., et al. (2019). Programmable corpora: Introducing DraCor. In Proceedings of DH 2019.

Fischer, F., & Börner, I. (2021). Linked data modeling in DraCor.

Fiormonte, D., et al. (2015). The Digital Humanist: A Critical Inquiry.

Fokkens, A., et al. (2014). Computational modeling of narrative perspective.

Gerlach, M., & Font-Clos, F. (2019). A standardized Project Gutenberg corpus for quantitative literary studies.

Gold, M. K. (Ed.). (2012). Debates in the Digital Humanities. University of Minnesota Press.

Heuser, R., & Le-Khac, L. (2012). A quantitative literary history of 19th-century fiction.

Hoover, D. L. (2004). Testing Burrows’s Delta. Literary and Linguistic Computing.

Hoover, D. L. (2007). Corpus stylistics and authorship attribution.

Jacobs, A. M. (2015). Towards a neurocognitive poetics model.

Jannidis, F., et al. (2015). Improving Burrows’s Delta. Digital Humanities Quarterly.

Jannidis, F., Kohle, H., & Rehbein, M. (Eds.). (2017). Digital Humanities: Eine Einführung.

Jockers, M. L. (2013). Macroanalysis: Digital Methods and Literary History. University of Illinois Press.

Juola, P. (2013). Authorship attribution. Foundations and Trends in Information Retrieval.

Kestemont, M. (2014). Function words in authorship attribution: From black magic to theory? In CLFL Workshop Proceedings.

Kestemont, M., et al. (2018). Cross-domain authorship attribution.

Koppel, M., Schler, J., & Argamon, S. (2009). Computational methods in authorship attribution.

Kuhn, J., et al. (2018). Computational linguistics and literary corpora.

Labatut, V., & Bost, X. (2019). Extraction and analysis of fictional character networks.

Liu, A. (2013). Where is cultural criticism in the digital humanities? In Debates in the Digital Humanities.

Luyckx, K., & Daelemans, W. (2011). Authorship attribution and verification.

Manovich, L. (2020). Cultural Analytics. MIT Press.

Michel, J.-B., et al. (2011). Quantitative analysis of culture using millions of digitized books. Science, 331(6014), 176–182.

Moretti, F. (2000). Conjectures on world literature. New Left Review, 1, 54–68.

Moretti, F. (2013). Distant Reading. Verso.

Odijk, J. (2016). CLARIN infrastructure and literary research.

Odebrecht, C., et al. (2019). ELTeC: Building a European literary corpus. DH 2019.

Patras, R., et al. (2022). Named entity linking in ELTeC corpora. In LDL Workshop (ACL).

Piper, A. (2015). Novel deviance and genre change.

Piper, A. (2018). Enumerations: Data and Literary Study. University of Chicago Press.

Plecháč, P. (2018). Czech verse stylometry.

Ramsay, S. (2011). Reading Machines: Toward an Algorithmic Criticism. University of Illinois Press.

Reiter, N. (2015). Automated analysis of plot structure.

Reiter, N., et al. (2017). Computational approaches to narrative modeling.

Rybicki, J. (2012). The great mystery of the (almost) invisible translator. Digital Humanities Quarterly.

Sahle, P. (2016). Digitale Editionswissenschaft.

Schöch, C. (2013). Topic modeling genre in French drama.

Schöch, C., & Eder, M. (Eds.). (2020–2022). The Distant Reading Compendium.

Schöch, C., et al. (2021). Creating the European Literary Text Collection (ELTeC): Challenges and perspectives. Modern Languages Open.

Schreibman, S., Siemens, R., & Unsworth, J. (Eds.). (2004/2016). A Companion to Digital Humanities.

Skorinkin, D., et al. (2021). Network analysis of European dramatic traditions.

Sprugnoli, R., & Tonelli, S. (2019). Event extraction in historical texts.

Tangherlini, T. (2013). Big data and folklore studies.

Terras, M., Nyhan, J., & Vanhoutte, E. (Eds.). (2013). Defining Digital Humanities.

Trilcke, P. (2013). Netzwerkanalyse dramatischer Texte.

Trilcke, P., et al. (2015). Social network analysis in dramatic texts. Digital Humanities Quarterly.

Underwood, T. (2016). The life cycles of genres. Journal of Cultural Analytics.

van Cranenburgh, A. (2018). Genre classification in Dutch novels.

van Zundert, J. (2012). If you build it, will we come? Large-scale digital infrastructures in DH.

Major projects, corpora & infrastructures (essential for comparative + computational work)

ELTeC (European Literary Text Collection) — balanced, comparable national subcorpora (TEI) intended specifically for comparative, multilingual distant-reading research; curated corpora across many European languages.
DraCor — The Drama Corpora Project — open infrastructure / API for >4,000 TEI-encoded dramatic texts (Ancient → 20th C); emphasizes “programmable corpora.”
PoeTree — open infrastructure for computational analysis of poems from several traditions
Eighteenth Century Collections Online (ECCO / ECCO-TCP) — large archive of 18th-century English-language printed works; used in long-span literary-historical corpora.

wp7olgat

Digital Methods for a Comparative Study of Central European Literary History

Location: Institute of Polish Language Polish Academy of Sciences, Krakow, Mickiewicza 31

02. 10. 2025. Thursday

Conference Day

Zoom: https://us02web.zoom.us/j/83032400147?pwd=BcczHFovkfuMasXtzK2c1Pliim1I2t.1

9:30-11:00 – Section 1. Chair: Jessie Labov

Marko Juvan (Research Centre of the Slovenian Academy of Sciences)

Snejana Ung (Lucian Blaga University of Sibiu): Inter-peripheral circulation of the novel between Romania and Yugoslavia

Kata Dobás (Institute for Literary Studies, Budapest) and Botond Szemes: Literary memory of CEE in the Wikipedia. Automatization of data cleaning.

11:30-13:00 – Section 2. Chair: Botond Szemes

Péter Róbert (University of Szeged): Multilingual Analysis and Visualization of Metadata and Texts in Central and Eastern European Literary Studies

Artjoms Sela and Petr Plecháč (Czech Academy of Sciences): PoeTree and Complit

Anna Mędrzecka-Stefańska: Polish Poetry Corpus and Complit

13:00-14:30 – Lunch

14:30-16:00 – Section 3. Chair: Artjoms Sela

Jacek Bakowski (Institute of Polish Language): Semantic change detection between languages

Benjamin Nagy (Institute of Polish Language): „SHAP Workflow for Interpretation”

Joanna Byszuk (Institute of Polish Language): Comparative possibilities in distant viewing

03. 10. 2025. Friday

Workshop Day

10:00-10:45 – Metadata analysis 1. Guest presentation (online): Georgii Korotkov, Valeriia Korotkova (Stanford University)

11:00-12:30: Topic modelling of novels. Comparative approach of Polish and Hungarain corpora (Patryk Hubar, Botond Szemes)

12:30-13:30 – Lunch

13:30-14:30 – Metadata analysis 2. (Jessie Labov. Róbert Péter); Guest presentation (online): Philip Gleissner (Ohio State University)

15:00-16:00 – Closing & Future

vf logotype blue page 0001

1000007369 imresizer

1000007469 imresizer

1000007381 imresizer

1000007470 imresizer

The first day of the second workshop took the form of a mini-conference, during which all participants presented the results of their recent research. The aim of the day was to showcase the diversity of studies conducted by members of the group within the broad framework of Computational and Comparative Studies.

We also hosted three guest speakers: Marko Juvan (Slovenian Academy of Sciences), Snejana Ung (Lucian Blaga University of Sibiu), and Róbert Péter (University of Szeged). The opening lectures by Juvan and Ung provided a theoretical foundation for the workshop’s theme, addressing the notion of “small nations” and regional approaches in Central and Eastern Europe. They discussed how these concepts can be defined and compared, and what challenges computational projects face in this context.

Next, Kata Dobás and Botond Szemes presented their ongoing work on the digital literary memory of the region, based on biographical metadata. They manaeged to extend the scope of the research since March and showed promising results of the automatization of data cleaning. Róbert Péter then introduced AVOMBAT, a new platform for the Analysis and Visualization of Metadata and Texts.

The following “poetry presentations” explored the intersection of geography and literature: Petr Plecháč demonstrated connections between geography and poetic traditions (mainly through prosody); Artjoms Sela examined the geographical imagination of European poetry; and Anna Mędrzecka-Stefańska presented the development of the Polish Poetry Corpus.

In the final session, Jacek Bąkowski introduced a metric for measuring semantic similarity between different languages, which may serve as a foundation for future research – mainly in connection with the topical research of novels (see below). Ben Nagy gave a detailed methodological introduction to SHAP values as a way of identifying key features in classification procedures, and Joanna Byszuk introduced the emerging field of Distant Viewing in the context of Central and Eastern Europe.

The first day highlighted the diversity yet interconnectedness of the participants’ research and inspired lively discussions. Some projects are already ready for publication, while others represent promising ideas for further development. The team agreed to continue both individual and collaborative research, which became the focus of the second day.

On the second day, the workshop centered on one of the key ideas from the first meeting in Budapest: comparing topical similarities between national literary traditions. Patryk Hubar and Botond Szemes created comparable corpora of Polish and Hungarian novels and developed a pipeline for generating relevant topics. Two methods were tested—LDA topic modeling and BERT-Topic. After promising results from smaller samples, the team decided to continue exploring the question: What are the differences and similarities in the topics of 19th-century Hungarian and Polish novels?

Finally, Jessie Labov organized a metadata workshop, featuring guest speakers (via Zoom) who introduced new methods for analyzing biographical and bibliographical metadata. This line of research will also be continued by group members, applying the presented methods to the context of Central and Eastern Europe.

Overall, the workshop fully achieved its aims: it brought together researchers from the region, fostered collaboration, and opened new perspectives for comparative literary studies using digital methods. A cohesive research group has now formed, with several promising joint projects. The team decided both to pursue further funding opportunities and to begin collecting papers on the various topics for a joint publication in a prestigious academic journal.

See below the Program of the Workshop no. 1., some photos and summary of the proceedings with detailed bibliography:

budapest ws program page 0001

20250321 1100111 min1

20250320 153213 min1

20250321 1044571 min1

The workshop fully achieved its objectives and successfully aligned researchers toward a shared research agenda. The opening lecture, delivered by Botond Szemes on the first day, outlined two broad directions for potential collaboration: the study of bibliographic and biographical metadata, and the text mining analysis of comparable literary corpora.

Jessie Labov, who will continue to coordinate related research in the future, led the discussion on the first category. Key contributions to this discussion were made by Kata Dobás, editor of the HUN-REN RCH Institute of Literary Studies' Wikibase-based semantic knowledge graph; Snejana Ung, who gave a presentation on literary relations between Romania and South Slavic countries based on translation metadata; and Anna Mędrzecka-Stefańska, who spoke on Polish biographical research. Based on the range of existing contributions, the group determined that it would be more productive to summarise and connect current infrastructures rather than build an entirely new database from scratch.

A public lecture series titled "Metadata Afternoon" was held on 20 March as part of this initiative. Presentations were given by Botond Szemes, Kata Dobás, Patryk Hubar, Snejana Ung, and guest speaker Péter Király (joining online), who spoke about translation patterns in Hungarian literature using large-scale data analysis. Király’s talk generated substantial interest, and the group decided to maintain closer contact with him in future projects. In addition to the group members, attendees included Gábor Palkó, Zsófi Fellegi, and Barbara Bobák from the DigiPhil project of the HUN-REN RCH Institute of Literary Studies; Tamás Scheibner, a researcher at Eötvös Loránd University (ELTE) and HUN-REN RCH; Róbert Péter from the University of Szeged, developer of AVOMBAT infrastructer; and Patrick Joula, a visiting lecturer at ELTE’s Department of Digital Humanities.

The second research strand—literary text analysis—was divided into two subfields. Petr Plecháč and Artjoms Sela introduced the PoeTree database and interface. They discussed the potential of algortihmic analysis of poetry through this plaform. Given the promising development of this project, much of the conversation focused on its potential applications for regional literary historiography. Ben Nagy contributed important insights, especially on the automatic detection of poetic imitation and literary influence.

A further lively discussion centered on the comparative study of the region’s fiction. Patryk Hubar, Snejana Ung, and Botond Szemes noted the availability of comparable corpora of novels in Polish, Romanian, and Hungarian. Following a suggestion by Joanna Byszuk, Artjoms Sela, and Jacek Bakowski, the group agreed to begin with a thematic comparison: after independently modeling the themes in each corpus, researchers will examine similarities and differences across the region’s novels. This strand of research will be coordinated by Botond Szemes going forward.

Below is a list of institutions from the CEE region that represent a wider group of scholars in Comparative Literary Studies:

Bosnia/Hercegovina	https://www.ff.unsa.ba/index.php/en/about-department-of-comparative-literature-and-library-sciences
Bulgaria	https://ilit.bas.bg/en/departments/department-of-comparative-literature.html
Croatia	https://www.ffzg.unizg.hr/kompk/index.php?show=page.php&idPage=23
Czech Republic	https://uclk.ff.cuni.cz/komparatistika/vyucujici-na-oboru-komparatistika/
Estonia	https://sisu.ut.ee/evka/?lang=en
Hungary
Lithuania	http://www.lla.lt/en
Poland	https://rejestr.io/krs/468924/polskie-stowarzyszenie-komparatystyki-literackiej
Romania	https://www.algcr.ro/en/home-3/
Slovakia	https://usvl.sav.sk/wp/?lang=en
Slovenia	http://sdpk.si/

Részletek: Kategória: Uncategorised

Project Title: Digital Methods for a Comparative Study of Central European Literary History

Project Partners:

Research Centre for Humanities, Institute of Literary Studies: https://iti.abtk.hu/en
Institute of Polish Language, Polish Academy of Sciences: https://ijp.pan.pl/en/, https://computationalstylistics.github.io/
The Institute of Literary Research of the Polish Academy of Sciences: https://ibl.waw.pl/en/
Czech Academy of Sciences, Institute of Czech Literature: https://ucl.cas.cz/en/

The project is supported by the Visegrad Fund (Visegrad Grant, Project 22430232)

vf logotype blue page 0001

Description of the project

Literature and cultural studies are increasingly dealing with the question of regionality, and in this context with the Central European or Visegrad region as a unique field of world literature. Typically, research examines cultural relations between nations in this region, their historical specificities, and how the region is represented in global literature. Despite the large textual sample required for such studies, Digital Humanities (DH) methodologies are seldom employed. While there are a few computational literary studies in the area, there has been no coordinated effort. This project aims to bring together leading DH researchers from the Visegrad countries to develop a regional literary historiography using digital literary studies. By doing so, we will identify the limitations of existing DH methods for less-researched languages like Polish, Czech, and Hungarian, and create new approaches based on previous experiences. This research is vital not only from a methodological standpoint but also for gaining deeper insights into the region’s shared yet diverse history. Data-driven analysis can reveal hidden patterns that traditional methods might overlook. For instance, we can examine how the cultural memories of Visegrad countries are structured, which translations historically shaped the literary field, and what stylistic or thematic features characterize the literature of certain periods. Ultimately, this project aims to deepen our understanding of Central European literary traditions using innovative digital tools.

To establish a regional literary historiography based on Digital Humanities (DH), we aim to bring together prominent researchers from the Visegrad countries for collaborative research. Scholars from institutions such as HUN-REN RCH, Institute for Literary Studies (Budapest), the Polish Academy of Sciences (Krakow and Warsaw), and the Czech Academy of Sciences (Prague) will participate. Each institution will contribute 2-3 researchers, ensuring effective collaboration within this small group. Guest lecturers from other Central and Eastern European countries, like Romania and Slovenia, will further enrich the project by sharing their perspectives. We plan two major workshops. The first, held in Budapest, will focus on sharing experiences and developing research directions. Afterward, participants will collaborate online, with smaller groups working on specific topics. The second workshop, six months later in Krakow, will involve presenting findings and will double as a short conference open to a wider audience. This project will not only enhance collaboration among Visegrad countries but also lay the groundwork for a cohesive regional historiography using DH techniques.

We plan to develop two key approaches in the research: - Historical-poetic comparison of similar corpora (like PoeTree, ELTeC or DraCor) from the region, focusing primarily on the 19th and early 20th centuries. This will involve methods such as stylometry and text mining to compare literary works across national boundaries. - Comparison of bibliographic and biographical metadata, analyzed from a literary-historical perspective. While similar research has been conducted at a national level, or in comparisons between two countries, this project will integrate and expand upon those efforts. By linking data across multiple countries, we aim to uncover new insights and provide a broader perspective on regional literary history. In terms of metadata research, combining existing national studies will enhance understanding of the region’s cultural and literary dynamics in ways not previously possible. In the historical-poetic comparison, the project is breaking new ground, as no prior research has systematically explored the literary history of this region in this manner. By working together, researchers from different national contexts can generate collective knowledge and interpretations that go beyond what could be achieved in a purely national framework, opening the door to genuinely innovative discoveries about Central European literary history.

wp7olgat

További cikkeink …

1. oldal / 2

Bibliography

Foundational monographs & books

Influential papers

Major projects, corpora & infrastructures (essential for comparative + computational work)

Workshop 2

Workshop 1

Community

Digital Methods for a Comparative Study of Central European Literary History

További cikkeink …

Menü