(Universität Innsbruck, Forschungszentrum Digital Humanities, Austria)
Abstract: Transkribus is an open platform for the recognition of handwritten historical documents. More than 40,000 users are registered on the platform, and several hundred of them work with the software every day. It is the only platform worldwide on which users can train neural networks themselves and thus optimize handwriting recognition for their specific documents.
Transkribus is based on deep learning and works independently of language and alphabet. Medieval documents in Latin can be processed in the same way as First World War letters in English or German, or texts in Hebrew or Arabic. More than 6,000 models have already been trained by users. In total, over 17 million pages have been uploaded to Transkribus for processing.
Transkribus was developed in an EU research project led by the University of Innsbruck. Following the project, the European Cooperative READ-COOP SCE was founded in 2019. READ-COOP now has more than 60 members, including renowned archives, libraries and universities from all over the world.
In this talk we will discuss the opportunities that Transkribus offers to everyone interested in recognizing historical documents. These comprise not only text recognition and the training of neural networks, but also searching collections via full-text search and keyword spotting. Moreover, we will report on new developments in machine-learning-based layout analysis and in making recognized documents available to the public.