The core objective of READ is to provide a service platform (Transkribus) for the automated recognition, transcription and searching of historical documents. The underlying technology is language- and alphabet-independent, which means that we are aiming at documents from the Middle Ages to the 20th century, from Ancient Greek to modern English, and from simple running text to sophisticated tables and forms.
The Transkribus service platform is fully available. More than 20,000 users (as of March 2019) have registered and downloaded the Transkribus expert interface. The Transkribus platform is accessible via several interfaces, the most important of which are:
- Transkribus expert client
- Web Interface
- Mobile Apps
- Page Image Explorer
- e-Learning Interface
- Crowd-sourcing interface (forthcoming)
Services and Tools
All services provided by READ rely on algorithms and tools developed as part of basic research and innovation. These tools are available to users via the Transkribus platform and can also be accessed as open source via the Transkribus GitHub repository:
- Handwritten Text Recognition
- Keyword Spotting (Query by Example, Query by String)
- Layout Analysis
- Forms and Table Recognition
- Automatic Writer Identification and Retrieval
- Correction and editing tools
- Text2Image matching tool
- e-Learning tool
- Export services covering a wide range of standardized formats
One of the main objectives is to establish Transkribus as a research infrastructure for humanities scholars, archives, libraries, public users (family historians, etc.) and computer scientists. The main reason to run a platform is simple: Handwritten Text Recognition and other tools based on machine learning rely heavily on training data. If such training data is collected centrally, every user will benefit from the work of every other user, without any need to share documents or collections directly.
Therefore, a business model will be developed in order to sustain the platform after the end of the project. Our main aim here is to find a fair balance between free services offered to everyone and paid services for processing large amounts of documents or adapting the software to the needs of specific users. Find out more about the future of the READ project.