The Lost Generation Corpus

Project: The Lost Generation Corpus

The Lost Generation, as a term, refers to expatriate U.S. authors who lived and worked in Europe during the first half of the 1900s. This project currently deals mainly with the group active in Paris during the early 1900s, where Gertrude Stein coined the term “The Lost Generation” to describe the group of young authors coming of age during, or directly after, the Great War.

Project information

Project manager
Daniel Ihrmark
Participating organizations
Linnaeus University
Comparative Literature, Computational Linguistics, Stylistics, History, Digital Humanities
(Department of Languages, Faculty of Arts and Humanities)

More about the project

The Lost Generation includes some of the best-known authors in American literature, such as F. Scott Fitzgerald, Gertrude Stein, Ernest Hemingway and T. S. Eliot. The moniker refers to authors who came of age during, or shortly after, the Great War. Their literature and poetry dealt with the break their generation experienced with the world of their parents, and with the new way of life they encountered during a time of radical changes within society.

The Lost Generation Corpus, at its core, is a project focused on setting up the resources for future research on the Lost Generation. The project collects, prepares, and saves the output of the authors in a computer-searchable text collection, a corpus, which allows materials to be sorted according to authorship, time of publication and text type. This dataset can then be explored with the myriad of tools available to us through corpus linguistics, allowing for insights into how the literary styles of the authors developed and changed throughout their careers.
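As a rough illustration of what sorting by authorship, time of publication and text type can look like in practice, the following Python sketch filters a small in-memory collection and counts word frequencies in the selection. It is not the project's actual implementation or data format: the `CorpusText` record, the `select` and `word_frequencies` helpers, and the placeholder entries in `SAMPLE` are all assumptions made for this example.

```python
from collections import Counter
from dataclasses import dataclass
import re


@dataclass
class CorpusText:
    """Hypothetical record for one text; the real corpus may store metadata differently."""
    author: str
    year: int
    text_type: str
    body: str


# Placeholder entries, invented for illustration only (not real corpus contents).
SAMPLE = [
    CorpusText("Author A", 1925, "novel", "the river and the road"),
    CorpusText("Author A", 1920, "short story", "the road home"),
    CorpusText("Author B", 1922, "non-fiction", "a letter from the city"),
]


def select(corpus, author=None, text_type=None, start_year=None, end_year=None):
    """Return the texts matching every filter that was given, ordered by year."""
    matches = []
    for text in corpus:
        if author is not None and text.author != author:
            continue
        if text_type is not None and text.text_type != text_type:
            continue
        if start_year is not None and text.year < start_year:
            continue
        if end_year is not None and text.year > end_year:
            continue
        matches.append(text)
    return sorted(matches, key=lambda t: t.year)


def word_frequencies(texts):
    """Count lowercase word forms across the given texts."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.body.lower()))
    return counts


if __name__ == "__main__":
    chosen = select(SAMPLE, author="Author A")
    for text in chosen:
        print(text.year, text.text_type)
    print(word_frequencies(chosen).most_common(3))
```

Selecting one author's texts in chronological order and comparing frequency counts between earlier and later periods is one simple way to begin tracing how a style develops over a career. Dedicated corpus tools offer far richer analyses than this sketch.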

The current sub-corpora in the project are:

  • Ernest Hemingway – 65 texts (Sundberg & Nilsson 2017)
  • F. Scott Fitzgerald – 132 texts + non-fiction (Sundberg 2018)
  • Gertrude Stein – 69 texts + non-fiction (Spring 2020 Corpus Methods in Practice course, MELL Programme)
  • William Faulkner – 29 books (currently mixed contents) (Ihrmark 2023)

The project is currently testing a stand-alone interface for the F. Scott Fitzgerald sub-corpus. The intention is for the interface to enable teachers to use basic computational analysis as a part of their exploration of F. Scott Fitzgerald together with their students.

The interface is available for Windows and Mac here: