Open Lecture; Workshop

Handwritten Text Recognition (HTR) for Historical Documents: Applying HTRflow, Exploring New Possibilities, and Addressing Remaining Challenges

Join us for a two-day event dedicated to exploring the possibilities and challenges of Handwritten Text Recognition (HTR) in the analysis of complex historical documents.

The National Archives of Sweden (Riksarkivet) holds vast digital collections of handwritten materials, including letters, registers, official records, and more. While these materials have been digitized, much of the content remains accessible only as images.

Traditional Optical Character Recognition (OCR), which is designed for printed text, is limited in its capacity to process this material. Handwritten Text Recognition (HTR) technology, by contrast, can interpret the nuances and variability of human handwriting, making it possible to automatically convert handwritten text into machine-readable data. HTR can therefore open up digital collections of documents to large-scale search, analysis, and other uses. Historical documents are rarely uniform, however: many collections contain complex layouts, inconsistent structures, and challenging handwriting styles.

This event, led by Riksarkivet data scientists, focuses on addressing these complexities and expanding the practical applications of HTR. During the workshops, participants will work with Google Colab notebooks built around Riksarkivet’s Python package, HTRflow, allowing them to follow along during demonstrations and experiment independently.

The event will also showcase Riksarkivet’s ongoing HTR initiatives, featuring examples ranging from simple handwritten running text to more complex document layouts.

Event programme:

April 21, Morning (9:00-11:30 CET)
Presentation and Discussion: An introduction to HTR in historical research, with a focus on methodological challenges and opportunities.

April 21, Afternoon (13:30-16:00 CET)
Workshop I: Hands-on work with HTRflow using guided exercises.

April 22, Morning (9:00-11:30 CET)
Workshop II: Advanced session exploring more complex cases and tasks.

This event is hosted with support from the Machine Learning for Difficult Digitizations (MaLDD): Old Maps and Beyond project, funded by the Sara Lisa Initiative of LNU's Faculty of Arts & Humanities. It is also a Huminfra event; Huminfra is Sweden’s national infrastructure supporting digital and experimental research in the humanities, of which both LNU and Riksarkivet are members. Lastly, this event is provided through the Centre for Digital Humanities at LNU.

The event will be held in English.

-- How to register --
Participation in this event is free of charge, but participants are asked to register. Please sign up here: https://forms.office.com/e/UhtJYMNbHR

 

About the presenters

Erik Lenas is lead data scientist at the Swedish National Archives. He studied computer science and literature and has worked at the National Archives since 2020, mainly on various HTR projects.

Viktoria Löfgren holds an MSc in computer science, has been a data scientist at the Swedish National Archives since 2024, and is the main developer of HTRflow.

Pontus Henningsson is a master's student in Digital Humanities at Linnaeus University, with a BSc in Computer Science and an MSc in Information Science focused on Natural Language Processing and computational linguistics. During an internship at the Swedish National Archives (Riksarkivet) in the autumn of 2025, he focused on developing Named Entity Recognition (NER) models for Old Icelandic texts and Handwritten Text Recognition (HTR) models for Old Swedish manuscripts. He is now writing his master's thesis on NER for Old Swedish at the Swedish National Archives.

For further information:
https://www.huminfra.se/
https://lnu.se/en/research/research-groups/digital-humanities/
https://riksarkivet.se/
https://huggingface.co/Riksarkivet

N1017V (Växjö); Zoom
Ahmad M. Kamal