A Robust Document Localization Solution with Segmentation and Clustering

Published: 01 Jan 2023, Last Modified: 22 May 2025, IEA/AIE (1) 2023, CC BY-SA 4.0
Abstract: In optical character recognition and textual information extraction, document localization is a preprocessing step with a significant impact on accuracy. Although numerous solutions have been proposed, localizing documents in images with complicated backgrounds remains an open problem. This paper presents a novel approach to document localization that handles difficult scenarios with complicated backgrounds. Our strategy combines deep learning with conventional image processing techniques: deep learning determines a rough region around the document’s boundary, while traditional image processing algorithms identify the document’s corners. Moreover, to improve model accuracy and mitigate the data-hungry nature of deep learning, we introduce efficient data annotation and augmentation techniques. We perform comprehensive experiments to evaluate the proposed method on the ICDAR 2015 SmartDoc Challenge 1 dataset. The experimental results show that our method achieves higher accuracy while requiring less real training data. Specifically, using only \(20\%\) of the real training data, our proposal improves the Jaccard Index by \(0.6\%\) on average and by \(1.7\%\) on the subset with the most complicated background.
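For readers unfamiliar with the metric reported above, the Jaccard Index is the ratio of the intersection to the union of the predicted and ground-truth document regions. The following is a minimal sketch on binary pixel masks; the function name `jaccard_index` and the mask representation are assumptions made for illustration, not the paper's code, and the challenge's own evaluation protocol may differ in detail.

```python
def jaccard_index(pred_mask, true_mask):
    """Intersection over union of two equally sized binary masks.

    Masks are lists of rows; truthy entries mark document pixels.
    (Hypothetical helper for illustration only.)
    """
    if len(pred_mask) != len(true_mask):
        raise ValueError("masks must have the same number of rows")
    intersection = 0
    union = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        if len(pred_row) != len(true_row):
            raise ValueError("mask rows must have the same length")
        for p, t in zip(pred_row, true_row):
            p, t = bool(p), bool(t)
            intersection += p and t  # pixel is in both regions
            union += p or t          # pixel is in at least one region
    # Two empty masks agree perfectly by convention.
    return intersection / union if union else 1.0


# Toy example: a 2x2 ground-truth document and a 2x3 prediction covering it.
true_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
score = jaccard_index(pred_mask, true_mask)  # 4 shared pixels / 6 total
```

In this toy case the prediction overshoots the document by one column, so the score is 4/6. A perfect localization would score 1.0.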