Recognising eighteenth-century legal records at Middle Temple

The Honourable Society of the Middle Temple is one of the four Inns of Court: prestigious professional associations for barristers working in England.

The archive and library of Middle Temple holds records of the Inn from the early sixteenth century onwards.  The most significant series of these documents are being digitised and made available online.

Middle Temple tentatively began exploring Transkribus in 2016.  The Inn first signed a Memorandum of Understanding with the READ project and then started to explore the possibilities of training Handwritten Text Recognition (HTR) models to recognise documents in its collections.

After discussions about the best documents to start with, they settled on digitised manuscript records of Middle Temple’s governing body, or Parliament.  These records date from 1762 to 1775 and were written in several very similar hands.

A selection of 101 bifolio pages were uploaded to Transkribus and transcribed by the Transkribus team.  David Woolley QC, a bencher at Middle Temple, then took care of proof-reading and correcting each page to ensure that the transcriptions were as accurate as possible.

These images and transcripts (around 80,000 transcribed words) became training data for generating an HTR model.  Data from the pre-existing ‘English Writing M1’ model was also included as part of the training process as a ‘base model’.  The ‘English Writing M1’ model is trained to recognise the writing of the English philosopher Jeremy Bentham (1748–1832) and his secretaries.  It is freely available to all Transkribus users for their experiments.

The resulting HTR model can produce transcripts of images from the test set with a very low Character Error Rate of 3.31%.  This is an amazing result!  Automated transcripts with such a low error rate immediately become a useful research resource.
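For readers unfamiliar with the measure, Character Error Rate is commonly defined as the number of single-character insertions, deletions and substitutions needed to turn the automated transcript into the corrected one, divided by the length of the corrected transcript. The Python sketch below illustrates that common definition only. It is not Transkribus code, the function names are our own, and the platform's exact calculation (for example, how it treats whitespace or punctuation) may differ. The sample strings are invented and do not come from the Middle Temple records.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the length of the reference transcript."""
    if not reference:
        raise ValueError("reference transcript must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Invented example: the automated transcript drops one letter
# from a ten-letter word.
corrected = "Parliament"
automated = "Parliment"
print(f"CER: {character_error_rate(corrected, automated):.2%}")
```

On this definition, a rate of 3.31% corresponds to roughly three character errors in every hundred characters of corrected text.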

Automated transcription of a page from the Middle Temple records.

The team at Middle Temple also created a dictionary based on one of their ‘Bench Books’, listing recurring names, abbreviations and unusual terms. The team hopes that this dictionary will improve the quality of the recognition.

Middle Temple is now exploring ways to build on this first great achievement, by making these transcripts available to researchers in a searchable database.

Thanks to Lesley Whitelaw, Barnaby Bryan and David Woolley at Middle Temple and Stuart Dunn at King’s College London for this collaboration.
