Towards Text-Line Segmentation of Historical Documents Using Graph Neural Networks

Kartik Chincholikar; Kaushik Gopalan; Mihir Hasabnis

Published: 02 Mar 2026, Last Modified: 11 Mar 2026

Venue: ICLR 2026 Workshop GRaM Poster

License: CC BY 4.0

Track: long paper (up to 8 pages)

Keywords: Historical documents, Text Line Segmentation, Graph Neural Networks, Domain Shift, Handwritten Text Recognition

TL;DR: We present an initial investigation into a Graph Neural Network (GNN)-friendly problem formulation for text-line segmentation of historical documents using geometric priors of text-lines.

Abstract: We present an initial investigation into a graph-based problem formulation for text-line segmentation of historical documents, in which characters (or grapheme clusters) are the nodes and edges connect each character to the previous and next characters on its text-line. This converts the image segmentation learning task into a binary edge classification learning task. It also enables training on large-scale synthetic data that simulates complex layouts, which enables better robustness to the layout-level distribution shifts observed in historical documents. Furthermore, we introduce a benchmark dataset of 15 Sanskrit manuscripts with diverse layouts. We propose a method based on CRAFT and Graph Neural Networks (GNNs) that uses geometric priors of text-lines to perform competitively with leading approaches in zero-shot and few-shot experimental settings on the introduced Sanskrit dataset and the U-DIADS-TL dataset. The proposed method further demonstrates competitive accuracy and better consistency than the leading methods Doc-UFCN and SeamFormer when evaluating robustness to distribution shifts over increasing data sizes (using intra-manuscript and inter-manuscript train–test data splits) on the introduced Sanskrit dataset and the DIVA-HisDB dataset. Finally, we demonstrate that the proposed method achieves strong performance in the downstream, goal-oriented evaluation of text recognized from the segmented text-lines. The dataset and the training and inference code are available at: https://github.com/flame-cai/gnn-synthetic-layout-historical/tree/gram-submission
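
The formulation in the abstract (characters as nodes, edges to the previous and next character, binary edge classification) can be illustrated with a small sketch. This is not the authors' implementation: the k-nearest-neighbour candidate graph, the function names, and the toy coordinates are all assumptions made for illustration, and ground-truth labels stand in for the predictions a trained GNN would produce.

```python
from math import hypot

def candidate_edges(points, k=2):
    """Connect each node to its k nearest neighbours (undirected, deduplicated).
    Assumption: a kNN graph is one plausible way to propose candidate edges."""
    edges = set()
    for i, (xi, yi) in enumerate(points):
        nearest = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: hypot(points[j][0] - xi, points[j][1] - yi),
        )[:k]
        for j in nearest:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)

def edge_labels(edges, line_id, position):
    """Label 1 if the two characters are consecutive on the same text-line, else 0."""
    return [
        int(line_id[i] == line_id[j] and abs(position[i] - position[j]) == 1)
        for i, j in edges
    ]

def lines_from_edges(num_nodes, edges, labels):
    """Recover text-lines as connected components over positive edges (union-find)."""
    parent = list(range(num_nodes))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for (i, j), keep in zip(edges, labels):
        if keep:
            parent[find(i)] = find(j)
    groups = {}
    for n in range(num_nodes):
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())

# Hypothetical toy page: two text-lines of three character centroids each.
points = [(0, 0), (10, 0), (20, 0), (0, 8), (10, 8), (20, 8)]
line_id = [0, 0, 0, 1, 1, 1]   # which text-line each character belongs to
position = [0, 1, 2, 0, 1, 2]  # reading-order index within its text-line

edges = candidate_edges(points, k=2)
labels = edge_labels(edges, line_id, position)  # stand-in for classifier output
lines = lines_from_edges(len(points), edges, labels)
```

On this toy page the candidate graph contains both within-line and cross-line edges. Keeping only the positively labelled edges yields the two groups `[0, 1, 2]` and `[3, 4, 5]`, one per text-line.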

Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.

Submission Number: 121
