Abstract: Separating handwritten from printed text in document images is an important task in optical character recognition (OCR) research. It remains challenging when handwritten and printed text lines overlap, as they often do in images of complex documents such as examination papers and legal documents. In this paper, handwritten and printed text separation is formulated as a pixel-level document image segmentation task. First, a modified Transformer-based model is designed for pixel-level document image segmentation. Second, a residual feature bypass is incorporated into the model to further exploit high-resolution features. Finally, a loss function combining focal loss and dice loss is designed to address the imbalanced distribution of pixels across classes. Experimental results on a public English document image dataset and a self-built Chinese document image dataset demonstrate the effectiveness of the proposed method.
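The combined loss mentioned in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulation, so everything specific here is an assumption: the function names, the focusing parameter `gamma`, the weighting factor `lam`, the plain weighted sum of the two terms, and the three-class labelling (background, printed, handwritten) used in the usage example.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def focal_loss(probs, onehot, gamma=2.0, eps=1e-7):
    """Mean focal loss; probs and onehot have shape (num_pixels, num_classes).

    The factor (1 - p_t)^gamma down-weights pixels that are already
    classified well, so rare classes contribute more to the gradient.
    """
    p_t = np.clip((probs * onehot).sum(axis=-1), eps, 1.0)
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))

def dice_loss(probs, onehot, eps=1e-7):
    """One minus the soft Dice coefficient, averaged over classes."""
    intersection = (probs * onehot).sum(axis=0)
    denom = probs.sum(axis=0) + onehot.sum(axis=0)
    dice = (2.0 * intersection + eps) / (denom + eps)
    return float(1.0 - dice.mean())

def combined_loss(logits, labels, num_classes, gamma=2.0, lam=1.0):
    """Focal loss plus lam times dice loss (the weighting is an assumption).

    logits: float array of shape (..., num_classes)
    labels: integer array of shape (...) with values in [0, num_classes)
    """
    flat_logits = logits.reshape(-1, num_classes)
    flat_labels = labels.reshape(-1)
    probs = softmax(flat_logits)
    onehot = np.eye(num_classes)[flat_labels]
    return focal_loss(probs, onehot, gamma) + lam * dice_loss(probs, onehot)

# Usage on a toy 2x2 "image" with three hypothetical classes:
# 0 = background, 1 = printed, 2 = handwritten.
labels = np.array([[0, 1], [2, 1]])
good_logits = 20.0 * np.eye(3)[labels]   # confident and correct
bad_logits = -good_logits                # confident and wrong
loss_good = combined_loss(good_logits, labels, num_classes=3)
loss_bad = combined_loss(bad_logits, labels, num_classes=3)
```

In this sketch the focal term is computed per pixel while the dice term is computed per class over the whole image, which is one common reason the two are paired for imbalanced segmentation; the paper itself may combine or weight them differently.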