READ revolutionizes access to handwritten documents
From the Middle Ages to today, from old Greek to modern English, from running text to tables or forms
We are excited about our next Transkribus User Conference, which this time will take place in Innsbruck. Preparations are in full swing and the date will be announced within the next few weeks.
Our last Transkribus User Conference in November 2018 was a great success, but there is always room for improvement. If you have ideas for topics for the next conference or other suggestions for improvement, we would be happy to receive them via firstname.lastname@example.org.
The program of our last user conference can be found here: https://read.transkribus.eu/transkribus-user-conference-2018/
Videos of the presentations are available on our YouTube channel: https://www.youtube.com/playlist?list=PL7UbQtd4qlhKCEgLnZbJKQu9qpA5iF-sC
We hope to meet you there!
Researchers at the University of Zurich compared two versions of the ABBYY FineReader OCR (Optical Character Recognition) software, FineReader XIX and FineReader Server 11, with the Transkribus HTR (Handwritten Text Recognition) in order to find out which is most effective at recognizing blackletter type in historical newspapers. For the test they used PDFs with medium-resolution images of the German-language Neue Zürcher Zeitung.
Recognizing blackletter type in historical newspapers can be particularly challenging because the characters are often hard to tell apart, the paper quality can be poor and, in many cases, small font sizes are used. Systems like ABBYY FineReader and Transkribus are working on tackling such problems. We are happy that the University of Zurich experiment shows that Transkribus provides significantly better results than the commercial system ABBYY FineReader.
The article attributes the effectiveness of HTR to the fact that only a modest amount of manual work is needed to create ground truth, which makes it practical to apply HTR to such documents. Error rates in Transkribus are usually low, especially for printed text in newspapers. Moreover, the test shows that the model trained on the Neue Zürcher Zeitung also provided good results for other newspapers of the same period, such as the Bundesblatt and the Neue Zuger Zeitung. The good news is that the Neue Zürcher Zeitung model will be made public during 2019.
If you would like to take a closer look at the experiment, you can find the whole article here: https://dev.clariah.nl/files/dh2019/boa/0694.html