Exploiting Spatial Attention and Contextual Information for Document Image Segmentation

Yuman Sang, Yifeng Zeng, Ruiying Liu, Fan Yang, Zhangrui Yao, Yinghui Pan

Published: 2022, Last Modified: 16 May 2023 | PAKDD (3) 2022 | Readers: Everyone

Abstract: We propose a new framework that combines an attention mechanism with a conditional random field for document image segmentation. The framework aims to recognize homogeneous regions, e.g. text, figures, or tables, in document images through a pixel-wise spatial attention module. The attention module captures essential global information and long-distance pixel dependencies. To draw on additional knowledge from the surrounding context, we use a conditional random field to model contextual information in the document. The new framework thus combines pixel features effectively with their contextual information in the document image segmentation task. We conduct extensive experiments on multiple challenging datasets and evaluate the performance of our new framework against a series of state-of-the-art segmentation methods.
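The abstract does not specify the design of the attention module, so the following is only a minimal sketch of a generic pixel-wise spatial self-attention layer, included to illustrate how every pixel can gather information from every other pixel regardless of distance. The function name `spatial_self_attention`, the projection matrices `w_q`, `w_k`, `w_v`, the scaled dot-product form, and the residual connection are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def spatial_self_attention(features, w_q, w_k, w_v):
    """Generic pixel-wise self-attention (illustrative; not the paper's module).

    features: (H, W, C) feature map
    w_q, w_k: (C, D) hypothetical query/key projections
    w_v:      (C, C) hypothetical value projection
    Returns the attended feature map (H, W, C) and the
    (H*W, H*W) attention weights between all pixel pairs.
    """
    h, w, c = features.shape
    x = features.reshape(h * w, c)           # one row per pixel

    q = x @ w_q                              # queries, (H*W, D)
    k = x @ w_k                              # keys,    (H*W, D)
    v = x @ w_v                              # values,  (H*W, C)

    # Similarity between every pair of pixels, scaled by sqrt(D).
    scores = q @ k.T / np.sqrt(q.shape[1])

    # Row-wise softmax, shifted by the row maximum for numerical stability.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)

    # Each pixel becomes a weighted sum over all pixels, plus a residual.
    out = x + weights @ v
    return out.reshape(h, w, c), weights


rng = np.random.default_rng(0)
feats = rng.standard_normal((4, 5, 8))
attended, attn = spatial_self_attention(
    feats,
    rng.standard_normal((8, 4)),
    rng.standard_normal((8, 4)),
    rng.standard_normal((8, 8)),
)
```

Because the weight matrix has one entry per pixel pair, its memory cost grows quadratically with the number of pixels, which is why layers of this kind are usually applied to downsampled feature maps rather than to full-resolution document images.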
