ICDAR 2024 Competition on Multi Font Group Recognition and OCR

Janne van der Loop, Florian Kordon, Martin Mayr, Vincent Christlein, Fei Wu, Dalia Rodríguez-Salas, Nikolaus Weichselbaumer, Mathias Seuret

Published: 2024, Last Modified: 02 Mar 2025, ICDAR (6) 2024, CC BY-SA 4.0

Abstract: This competition investigates the performance of several methods for two types of analyses of early modern prints: (1) optical character recognition, and (2) font recognition at the character level. We have created and published a novel dataset that contains the ground truth for both tasks. The dataset has been carefully curated and annotated by an expert with several years of experience in transcribing early modern prints. Both tasks involved two distinct tracks that differ in the training data allowed: the first track restricts participants to the provided data for model training, and the second removes this restriction. Of the five participating teams, four took part in the first track and three in the second. The best team reached a text Character Error Rate (CER) of 0.82 % and a font CER of 2.96 % in the first track. In the second track, these numbers improved slightly to 0.81 % text CER and 2.78 % font CER.
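The abstract reports results as Character Error Rates without spelling out the computation. As a rough illustration only, the sketch below uses the common definition of CER: the Levenshtein edit distance between prediction and reference, divided by the reference length. The function names `edit_distance` and `cer` are hypothetical, and treating the font CER as the same measure applied to a sequence of per-character font labels is an assumption, not the competition's documented evaluation protocol.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to the empty prefix of hyp
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from ref
                curr[j - 1] + 1,     # insertion into ref
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def cer(ref, hyp):
    """Character Error Rate: edit distance normalised by reference length."""
    if len(ref) == 0:
        raise ValueError("reference must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Text CER on a toy example: one substituted character out of four.
text_cer = cer("abcd", "abxd")

# Assumed font CER: the same measure over per-character font labels
# (the label names here are illustrative placeholders).
font_ref = ["fraktur", "fraktur", "antiqua"]
font_hyp = ["fraktur", "antiqua", "antiqua"]
font_cer = cer(font_ref, font_hyp)
```

Under this definition, a text CER of 0.82 % corresponds to fewer than one edit per hundred reference characters. Because insertions count as errors, the value can exceed 100 % when a prediction is much longer than its reference.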