Efficient Joint Rectification of Photometric and Geometric Distortions in Document Images

Published: 2024, Last Modified: 07 Nov 2025, Venue: ICASSP 2024, License: CC BY-SA 4.0
Abstract: Document images captured with cameras often exhibit photometric and geometric distortions. Here, we propose a novel learning-based approach for efficient joint rectification of document images. Inspired by the strong correlation between visual shadows and physical deformations, we design a shared encoder architecture to fully leverage structured document features. A cross-attention module is introduced to facilitate information exchange between deformation and coordinate domains. Our method effectively addresses both geometric and photometric distortions in an end-to-end manner, making it highly valuable for applications involving camera-captured document images.
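The abstract does not specify how the cross-attention module is built, so the following is only a minimal sketch of generic scaled dot-product cross-attention between two feature branches, not the authors' implementation. The names `branch_a_tokens` and `branch_b_tokens`, the token values, and the omission of learned query/key/value projections are all assumptions made for illustration; in a design like the one described, both token sets would presumably be derived from the shared encoder.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Each query token attends over the tokens of the other branch.

    Plain scaled dot-product attention with no learned projections
    (an illustrative simplification, not the paper's module).
    """
    dim = len(queries[0])
    out_dim = len(values[0])
    output = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        output.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(out_dim)
        ])
    return output


# Hypothetical token features for two branches (values are made up).
branch_a_tokens = [[1.0, 0.0], [0.0, 1.0]]
branch_b_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Information exchange in both directions: each branch queries the other.
a_updated = cross_attention(branch_a_tokens, branch_b_tokens, branch_b_tokens)
b_updated = cross_attention(branch_b_tokens, branch_a_tokens, branch_a_tokens)
```

Each updated branch keeps its own token count while every output token becomes a weighted average of the other branch's features, which is one simple way two tasks could share information.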