READ revolutionizes access to handwritten documents

From the Middle Ages to today, from old Greek to modern English, from running text to tables or forms

About

READ's mission is to revolutionize access to archival documents with the support of cutting-edge technology such as Handwritten Text Recognition (HTR) and Keyword Spotting (KWS).

Network

READ addresses archives and libraries, humanities scholars, family historians, volunteers - and computer scientists

Research

Research in READ comprises exciting fields such as Artificial Intelligence, Pattern Recognition, Machine Learning and Natural Language Processing.

Services

READ technology is available via the service platform Transkribus. Upload documents, train a Handwritten Text Recognition (HTR) model, process text and follow the progress of the project.

Recent Posts

Handwritten Text Recognition at the National Archives of Finland

Over the past three years, research groups and archives from all over Europe have been working on Handwritten Text Recognition for historical documents. The results can now be seen at the public Transkribus seminar at the National Archives of Finland in Helsinki on Wednesday, 26.6.2019!

The Transkribus platform enables non-technical users to train neural networks to recognize and search historical documents. The seminar will provide an update on the latest technical developments and showcase how Transkribus can be used in various scenarios. Moreover, a first version of a web interface for searching Finnish court records from the 19th century will be launched. With this interface, users can search historical documents in a “Google-like” way.

The READ project is currently being transformed into one of the first European Cooperative Societies in the research, education and cultural heritage domain. Institutions and private persons are warmly invited to join this initiative.

If you would like to take part, please register via the following link (participation is free of charge and registration is open until 18.6.2019): https://www.eventbrite.com/e/transkribus-seminar-at-the-national-archives-of-finland-tickets-61567839064

The program includes an inspiring set of presentations from our international partners, as well as lunch and a panel discussion:

10.00 Welcoming words

10.15 READ-COOP: Günter Mühlberger (UIBK)

      Transkribus and the technology behind it

10.45 Transkribus platform: Sebastian Colutto (UIBK)

11.15 HTR in READ and Transkribus: Roger Labahn, Gundram Leifert (URO and CITlab)

11.30 Segmentation tools: Sofia Ares Oliveira (EPFL)

11.45 Table recognition: Hervé Déjean (NAVER)

12.00 ScanTent and DocScan: Matthias Wödlinger (CVL)

12.15-13.15 Lunch

      Transkribus in practice

13.15 Edelfelt project: Maria Vainio-Kurtakko (SLS)

13.45 VeleHanden: Marc Ponte, Jirsi Reinders

14.15 Court Records Collection (NAF and UPVLC)

15.00 Panel discussion

Source: https://pixabay.com/photos/helsinki-city-night-finland-1269310/

HTR+ reads old Slavonic documents with 3-5% Character Error Rate

Our new HTR+ was recently tested on different styles of Church Slavonic handwriting by Achim Rabus, who holds the Chair of Slavic Linguistics at the University of Freiburg in Germany. With Transkribus’ technology, the character error rates went down to 3 to 5 percent. Superscript letters, abbreviations and word separation were the challenges HTR+ had to deal with.
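For readers unfamiliar with the metric: the Character Error Rate is commonly defined as the character-level edit distance between an automatic transcript and a reference transcription, divided by the length of the reference. The following is a minimal sketch under that common definition. This post does not describe how Transkribus itself computes the figure, and the function names and sample strings are purely illustrative.

```python
def levenshtein(ref: str, hyp: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn ref into hyp."""
    prev = list(range(len(hyp) + 1))  # distances for the empty prefix of ref
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from the reference
                curr[j - 1] + 1,     # insert h into the reference
                prev[j - 1] + cost,  # substitute r with h, or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance normalised by the reference length (assumed definition)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Illustrative strings only: one missing letter and one wrong letter.
reference = "handwritten text recognition"
hypothesis = "handwriten text recogmition"
print(f"{character_error_rate(reference, hypothesis):.1%}")  # prints 7.1%
```

Under this definition, a rate of 3 to 5 percent means that roughly 3 to 5 out of every 100 characters in the automatic transcript would need correcting.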

A paper on recognizing handwritten text in Slavic manuscripts with Transkribus is about to be published by Achim Rabus. Within this project he discovered the potential of Transkribus for digitizing Church Slavonic manuscripts. Two things make “digitisation-life” a lot easier: the possibility of searching large documents without even having a special model for the individual handwriting, and the opportunity to avoid a full manual transcription by just correcting the mistakes of the automated one.

Some of the models Achim Rabus has trained already cover different hands and provide useful automatic transcripts. Nevertheless, the READ team is working on further improving Transkribus so that automatic transcripts with a low character error rate can also be produced for documents with mixed handwriting.

Cooperation is the key to getting the greatest benefit for everybody. Achim Rabus is convinced of this too, and he is therefore happy to share his model with anyone interested. You can get in touch with him via email: achim.rabus@slavistik.uni-freiburg.de

You can have a look at the draft of the paper “Recognizing handwritten text in Slavic manuscripts: A neural-network approach using Transkribus” at the following link: https://www.academia.edu/38835297/Recognizing_handwritten_text_in_Slavic_manuscripts_A_neural-network_approach_using_Transkribus_1_Achim_Rabus

Source: Rabus, Achim: Recognizing handwritten text in Slavic manuscripts: A neural-network approach using Transkribus