+ Crowdsourced digitisation with DocScan

At the READ project, we believe in using cutting-edge technology to help people study a rich variety of historical documents.

The Computer Vision Lab at the Technical University of Vienna (one of the READ project partners) have developed the DocScan mobile app for this very purpose.

The app automatically detects the page area of a document in milliseconds and provides real-time feedback on the quality of the image. It also has an auto-shoot feature which will take a picture every time a page is turned. It works especially well when used alongside our ScanTent device (also developed by the Computer Vision Lab), which holds a mobile phone in place above a historical document and allows for hands-free scanning.

The remarkable potential of these tools was revealed at our recent Transkribus User Conference. Dirk Alvermann from the University of Greifswald Library (one of READ’s MOU partners) presented the results of an experiment his team had been conducting.

The idea is that archival users can work with DocScan and the ScanTent to digitise historical documents with their mobile phone and then share the resulting images directly with the archive.

As shown in the video below, Greifswald University Library assigned a QR code to a set of documents and asked users to scan this code with the DocScan app before they started digitising. Once images had been captured using DocScan and the ScanTent, they were uploaded to our Transkribus platform and became available to view and transcribe in the Transkribus Web interface. The library was then able to create links in its digital repository, connecting archival metadata with the digitised images on Transkribus Web. A future version of DocScan will make it easier for images to be ingested directly into archival systems.

Dirk Alvermann emphasised that this workflow could be incredibly beneficial for small archives that lack funding for digitisation. Whilst user-generated content is not a substitute for a full digitisation strategy, it has the advantage of creating new resources and engaging interested archival users.

The DocScan app is available to download now, free of charge. The ScanTent is still in development and will be available for sale and hire later in 2019.

+ Transkribus around the globe – coverage on TVNZ

Our Transkribus platform for Handwritten Text Recognition (HTR) is used by thousands of researchers and archivists all over the world. And we’ve just been featured on the news on the New Zealand television network TVNZ.

Archives New Zealand explained how they have been experimenting with Transkribus to recognise text in different historical collections. They underlined some of the huge benefits of working with Transkribus: transcription is sped up, physical documents receive less wear and tear and, most importantly, historical treasures become much easier to access!

+ Medievalists! Share data with our working group to improve Handwritten Text Recognition

With thousands of Transkribus users working all over the world, there is huge potential for collaborative work on the automated recognition of historical documents.

Dr Tobias Hodel (State Archives of Zurich, University of Zurich) has set up the ‘Gothic Hands’ working group with this in mind, hoping to improve recognition of medieval Gothic script. The ‘Comb_Gothic_Bookwriting’ model has been trained on different sets of medieval scripts and is already available to all Transkribus users. In the best cases, it can produce automated transcripts with a Character Error Rate of less than 10%.
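
For readers unfamiliar with the measure, Character Error Rate is commonly defined as the number of single-character edits (insertions, deletions, substitutions) needed to turn the automated transcript into the correct one, divided by the length of the correct transcript. The short Python sketch below illustrates that common definition only. It is not the Transkribus implementation, which may normalise or count text differently, and the function names and the sample line are our own invention for illustration.

```python
def levenshtein(reference: str, hypothesis: str) -> int:
    """Minimum number of single-character edits turning reference into hypothesis."""
    # previous[j] holds the distance between the reference prefix processed
    # so far and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # delete a reference character
                current[j - 1] + 1,      # insert a hypothesis character
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference transcript."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical example: a correct transcript and an automated one in which
# "nomine" was misread as "nomme" (one deletion plus one substitution).
correct = "in nomine domini amen"
automated = "in nomme domini amen"
cer = character_error_rate(correct, automated)
print(f"CER: {cer:.1%}")
```

In this invented example, two edits across 21 characters give a rate just under 10%, which is roughly one wrong character in every ten.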

We are looking for more users to join this working group and share images and transcripts of Gothic scripts written between the 11th and 15th centuries. The current model has been trained primarily on German language material, so we are especially keen to receive documents written in Latin.

The latest contributor to the working group is Digital Statius: The Achilleid, a project which is producing a digital edition of the Achilleid. In this unfinished Latin epic, the poet Statius (later 1st c. AD) narrates the childhood of Achilles and the hero's stay on the island of Scyros. The text was part of the school curriculum in the Middle Ages, before losing its status as a classic. The project, funded by the Swiss National Science Foundation (SNSF) and based at the University of Geneva, aims to produce a new critical edition of the Achilleid, fully and exclusively digital, which takes into account the complete manuscript tradition of the poem (224 manuscripts, c. 8000 images). The open access digital critical edition will include a new text, a full interactive apparatus criticus, comparative visualization of numerous readings, comments, translations, links to other tools and/or platforms, and images of as many manuscripts as possible.

If you work with Gothic script, you can join the team behind the Achilleid edition and many others by becoming part of the ‘Gothic Hands’ working group.

To participate in the group, you can:

  • share existing training data that you have already prepared in Transkribus
  • prepare new images and transcripts in Transkribus in the ‘Gothic Hands’ collection
  • send over files containing images and transcripts which can be matched automatically and converted into training data

Please contact Tobias Hodel (tobias.hodel@hist.uzh.ch) with any questions about the group.

Working together gives us a great chance to transcribe and search medieval documents more efficiently!

+ READ on the move to READ-COOP

The READ consortium, together with several other institutions, is currently preparing to found a legal entity (working title: READ-COOP) which will serve as the basis for sustaining and further developing the Transkribus platform and related services.

The governance model will be based on the EU directive for European Cooperative Societies (SCE). Although the SCE will be set up according to EU law, it will also be open to members outside of the European Community.

An SCE

  • is a legal entity that allows its members to carry out common activities, while preserving their independence
  • has the principal objective of satisfying its members’ needs and not the return of capital investment
  • allows members to benefit proportionally to their profit and not to their capital contribution.

+ Podcast with READ project coordinator

Günter Mühlberger, coordinator of READ and head of the Digital Humanities Research Center at the University of Innsbruck, has recently been interviewed on a new podcast (in German).

The interview was recorded by the NewsEye project which, like READ, is funded by the European Union's Horizon 2020 scheme. NewsEye aims to use digital tools to provide enhanced access to digitised historical newspapers, and the project will build upon READ's existing achievements relating to the automated recognition of printed text.

+ Learn how to add structural tags to documents in Transkribus

We have another new How to Guide for users of our Transkribus platform.  This time we’re showing you how to enrich documents with structural tags like ‘paragraph’, ‘heading’, or ‘footer’.

In the near future, it will be possible to train models to automatically recognise the structure of historical documents.  Adding structural tags creates training data for this process.  If you work with this feature, there is no need to tag every element of your documents – just focus on marking up the sections that are of interest to you.

If you have any questions about structural tags, the Transkribus team are here to help (email@transkribus.eu).

+ ScanTent makes it to Mali, West Africa!

Prototypes of the ScanTent, our device for digitising documents with a mobile phone, have been popping up all over Europe over the past year. And in December 2018, the first ScanTent made it to Africa!

Dr Vincent Hiribarren (King's College London) took the tent to the town of Kita in western Mali to try it out before using the professional equipment (cameras, tripod, scanner) provided by an Endangered Archives Programme project called ‘Recovering the rich local history of Kita (Mali) through the salvaging of its archival heritage’. The grant for this project is held and directed by Dr Marie Rodet (SOAS, University of London).

The Endangered Archives Programme at The British Library awards annual grants to preserve archival material that is at risk of destruction or neglect. This funding means that endangered archival collections can be transferred to new homes, digitised and deposited at local institutions and in The British Library.

The ScanTent is a portable piece of equipment which holds a user’s phone in place above a historical document, providing a consistent source of light and leaving users with their hands free to turn pages or move documents around.  The advantages of the ScanTent become even greater when it is used in conjunction with our DocScan mobile app.  DocScan automatically detects the page area of a document and provides real-time feedback on image quality.  It also has an auto-shoot feature which will take a photo every time a page is turned.  Transkribus users can upload images to the platform directly from DocScan and these images can then be used for training an Automated Text Recognition model.

Dr Hiribarren installed DocScan on his phone in advance of his trip and was then able to set up the ScanTent quickly on location in Kita and start scanning! This experiment really shows that these tools have huge potential to open up access to unique collections of historical material all over the globe.

DocScan is available now, free of charge.  The ScanTent is still in development and units will be available for sale and hire later in 2019.

Testing out the ScanTent in Kita (Mali). Image credit: Vincent Hiribarren.

+ Latest Transkribus video tutorials

We’re celebrating the New Year with the release of several video tutorials designed to help new users navigate our Transkribus platform.

If you have a few minutes, you can get a nice overview of the stages needed to automatically recognise and search handwritten and printed historical texts in Transkribus.

How to use Transkribus in 10 steps

Segmentation

Training a Handwritten Text Recognition model

Using Handwritten Text Recognition models

Keyword Spotting

If you need extra help with Transkribus, please check out our detailed How to Guides.

+ Meet the READ project partners: Johanna Walcher

What’s your name?

Johanna Walcher

Where do you work?

The University of Innsbruck.

Tell us a bit about your background…

I did a bachelor’s degree in Transcultural Communication for Italian and Russian Language. Currently I am working towards my master’s degree in Media Studies and also study Philosophy. In my leisure time I love doing sports, preferably in the mountains. Hiking, running, mountain biking, skiing and yoga are great! But I can also sit still and spend a whole day reading books and newspapers. Whenever I can I travel the world and spend as much time as I can with my friends and family.

What is your role in the READ project?

In the READ project I am responsible for the Transkribus How to Guides, which should make life easier for users. I also run Transkribus webinars where I present the READ project and show interested people how to use the platform. As a member of the Dissemination Working Group, I am involved in spreading the word about the READ project.

What is at the top of your to-do list at the moment?

My current project is recording short screencast videos, which explain the different steps of the Transkribus workflow. The first ones are already up on our YouTube channel.

What do you like best about working on READ?

All the interesting projects around READ and the inspiring people I work with.

If you could do another job for just one day, what would it be?

Good question! There are too many jobs in the world I would love to try and it would take me too much time to choose. Working hours are expensive, so I would rather save READ some money and do my actual job, which I like a lot! 🙂

What can you see out of the window of your office?

Thanks Johanna!