READ revolutionizes access to handwritten documents
From the Middle Ages to today, from old Greek to modern English, from running text to tables or forms
Researchers at the University of Zurich compared two versions of the ABBYY FineReader OCR (Optical Character Recognition) software, FineReader XIX and FineReader Server 11, with the Transkribus HTR (Handwritten Text Recognition) engine to find out which delivers the best recognition results on black letter type in historical newspapers. For the test they used PDFs with medium-resolution images of the German-language Neue Zürcher Zeitung.
Recognizing black letter type in historical newspapers can be particularly challenging: the characters are often hard to tell apart, the paper quality can be poor and, in many cases, small font sizes are used. Systems like ABBYY FineReader and Transkribus aim to tackle such problems. We are happy that the University of Zurich experiment shows that Transkribus provides significantly better results than the commercial system ABBYY FineReader.
The article explains why HTR is so effective: only a modest amount of manual work is needed to create ground truth, which makes it practical to apply HTR to such documents. With printed texts in newspapers especially, error rates in Transkribus are usually low. Moreover, the test shows that the model trained for the Neue Zürcher Zeitung also provided good results for other newspapers of the same period, such as the Bundesblatt and the Neue Zuger Zeitung. The good news is that the Neue Zürcher Zeitung model will be made public during 2019.
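For readers unfamiliar with how recognition results are usually compared, a common measure is the character error rate (CER): the edit distance between the recognized text and the ground truth, divided by the length of the ground truth. The following is a minimal sketch of that general idea, not the evaluation code used in the Zurich experiment; the function names and the sample strings are our own illustrations.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Return the minimum number of character insertions, deletions
    and substitutions needed to turn `hypothesis` into `reference`."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            substitution_cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the ground-truth text."""
    if not reference:
        raise ValueError("reference (ground truth) must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: one wrongly recognized character in a 7-letter word.
    ground_truth = "Zeitung"
    recognized = "Zeitnng"
    print(f"CER: {character_error_rate(ground_truth, recognized):.3f}")
```

A lower CER means fewer characters have to be corrected by hand, which is why it is a convenient figure for comparing two recognition systems on the same ground truth.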
If you would like to take a closer look at the experiment, you can find the whole article here: https://dev.clariah.nl/files/dh2019/boa/0694.html
On 1 July 2019, the READ project will become a European Cooperative Society (SCE). READ-COOP will serve as the basis for sustaining and further developing the Transkribus platform and related services and tools.
READ-COOP will be based on the EU directive on the European Cooperative Society (SCE). Although the SCE will be set up according to EU law, it will also be open to members from outside the European Union. If you are interested in working with Transkribus in the long run, join READ-COOP and benefit from the work done by your collaborators.
One of the main reasons we decided to opt for a cooperative is that we want to support a “culture of collaboration” between archives and libraries, humanities scholars, computer scientists and the public (volunteers). We believe that intersectoral collaboration and full control over data are key to successfully integrating machine learning technologies into society and daily life. An SCE provides the best infrastructure for achieving this goal.
An SCE is a legal entity that is open to new members (institutions and natural persons). Members benefit from an SCE directly; there is no shareholder value. Moreover, SCEs are organised democratically: the General Meeting has the final say.
More information can be found here: https://read.transkribus.eu/coop/