Abstract: Recognizing the place names within textual labels on historical maps is complicated by many factors, such as curvilinear baselines and dense overlap with other textual or graphical elements. However, maps’ alignment with known geography and inter-label typographic style consistencies provide strong cues for resolving uncertainty and reducing text recognition errors. We present a unified probabilistic model to leverage the mutual information between text labels and styles and their geographical locations and categories. This work also introduces likelihood functions to model label placement for polyline and polygon geographical features, such as rivers or provinces. We evaluate the methods on 30 maps from 1866–1927. By interleaving automated map georeferencing with text recognition, we reduce word recognition error by 36% over OCR alone. Incorporating category-style links reduces toponym matching error by 32%.