Decomposing Document Images by Heuristic Search

Published: 2007, Last Modified: 13 Nov 2024, Venue: EMMCVPR 2007, License: CC BY-SA 4.0
Abstract: Document decomposition is a basic but crucial step for many document-related applications. This paper proposes a novel approach to decomposing document images into zones. It first generates overlapping zone hypotheses based on generic visual features. Each candidate zone is then evaluated quantitatively by a learned generative zone model. Finally, a heuristic search algorithm infers the optimal set of non-overlapping zones that covers a given document image. The experimental results demonstrate that the proposed method is robust to variation in document structure and to noise.
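The final step of the pipeline, choosing non-overlapping zones from scored, overlapping hypotheses, can be illustrated with a small sketch. The abstract does not specify the zone model, the objective, or the search algorithm, so everything here is an assumption made for illustration: zones are axis-aligned rectangles, each carries a precomputed score, the objective is the sum of scores of the selected zones, and the search is a generic best-first (A*-style) search with an optimistic bound. The names `overlaps` and `best_zones` are hypothetical, and the requirement that the zones cover the page is not enforced.

```python
import heapq
from itertools import count
from typing import List, Tuple

# Assumed representation: (x0, y0, x1, y1), half-open on the right and bottom.
Zone = Tuple[int, int, int, int]


def overlaps(a: Zone, b: Zone) -> bool:
    """Return True if two rectangles share interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def best_zones(zones: List[Zone], scores: List[float]) -> Tuple[float, List[int]]:
    """Pick the non-overlapping subset of zones with the highest total score.

    Best-first search over take/skip decisions for zones 0..n-1. The
    heuristic is the sum of the remaining positive scores, which never
    underestimates what can still be gained, so the first complete
    state popped from the queue is optimal.
    """
    n = len(zones)
    # suffix[i]: optimistic bound on the score obtainable from zones i..n-1.
    suffix = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + max(scores[i], 0.0)

    tie = count()  # tie-breaker so the heap never compares tuples of indices
    # Entries: (-(score so far + bound), tie, next zone index, score so far, chosen)
    heap = [(-suffix[0], next(tie), 0, 0.0, ())]
    while heap:
        _, _, i, g, chosen = heapq.heappop(heap)
        if i == n:
            return g, list(chosen)
        # Option 1: skip zone i.
        heapq.heappush(heap, (-(g + suffix[i + 1]), next(tie), i + 1, g, chosen))
        # Option 2: take zone i if it does not overlap any zone already chosen.
        if all(not overlaps(zones[i], zones[j]) for j in chosen):
            g2 = g + scores[i]
            heapq.heappush(
                heap, (-(g2 + suffix[i + 1]), next(tie), i + 1, g2, chosen + (i,))
            )
    return 0.0, []
```

For example, given one full-page hypothesis scored 2.0 and two half-page hypotheses scored 1.5 and 1.0, the search prefers the two halves because their combined score is higher. In the paper's setting the scores would come from the learned generative zone model rather than being supplied by hand.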