A Comprehensive Comparison of Open-Source Libraries for Handwritten Text Recognition in Norwegian

Published: 01 Jan 2022, Last Modified: 01 May 2023, DAS 2022, Readers: Everyone
Abstract: In this paper, we introduce an open database of historical handwritten documents fully annotated in Norwegian, the first of its kind, which enables the development of handwritten text recognition (HTR) models for Norwegian. To evaluate the performance of state-of-the-art HTR models on this new database, we conducted a systematic survey of open-source HTR libraries published between 2019 and 2021, identified ten libraries, and selected four of them to train HTR models. We trained twelve models in different configurations and compared their performance on both random and writer-based data splits. The best recognition results were obtained with the PyLaia and Kaldi libraries, which have different and complementary characteristics, suggesting that they should be combined to further improve the results.