Abstract: In recent years, document image rectification has seen substantial advancements.
Nonetheless, current leading algorithms are primarily effective for images with clearly
defined document boundaries and some degree of distortion. These algorithms often
struggle when presented with images containing text in only a specific area or with
incomplete boundaries, leading to subpar rectification results. This limitation is particularly
problematic in situations where only sections of a document need processing.
Although there are methods that attempt to address these issues, they frequently encounter
difficulties when dealing with a combination of intricate distortions and diverse
document layouts. To address this gap, our paper introduces a novel approach for document
image rectification that specifically targets images with partial or missing document
boundaries. Recently, attention-based neural networks have proven highly effective in
enhancing the accuracy and efficiency of document rectification. By utilizing attention
mechanisms, these networks can focus on relevant parts of an image, thereby
improving the rectification outcomes. Our paper presents ‘DocAttentionRect’, an innovative
attention-based rectification network that incorporates attention modules alongside parallel
convolution layers to address complex document image rectification challenges. Our
proposed architecture captures extensive dependencies and key textual and structural features
throughout the rectification process. DocAttentionRect is capable of handling all
document types, regardless of the visibility of their boundaries. Extensive experiments
conducted on the DocUNet, DIR300, and UVDoc datasets demonstrate the superior
performance and effectiveness of our proposed architecture.