ImpResDescan: Diffusion-Based Restoration for Scanned Document Images via Implicit and Ambient Training

16 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Descanning, Diffusion, Image restoration
TL;DR: ImpResDescan is a trainable framework that restores scanned images with color shifts, distortions, and misalignments using learned color correction and residual diffusion, achieving state-of-the-art results on the DESCAN-18K dataset.
Abstract: We propose ImpResDescan, a distortion-aware descanning framework that restores high-quality digital images from scans degraded by nonlinear color shifts, local artifacts, and geometric misalignments introduced by print–scan pipelines. Unlike the most recent prior work, DescanDiffusion+, which applies a linear channel-wise distribution correction and assumes perfect alignment, ImpResDescan removes those handcrafted assumptions through two components: (i) an implicit color correction module that couples a global encoder with a pixel-wise implicit mapper to learn a scan-conditioned, per-pixel nonlinear color transformation directly from data; and (ii) a residual local refinement module trained with an ambient strategy that is robust to spatial and semantic misalignment by supervising only keypoint-aligned regions and regularizing global structure with the Multiscale Sliced Wasserstein Distance (MS-SWD). The residual local refinement module is additionally conditioned on a degradation-aware encoder, enabling robust removal of localized artifacts with low computational overhead and delivering up to 2× faster inference than DescanDiffusion+. Extensive experiments across multiple datasets, including DESCAN-18K (18,000 scan–original pairs), show that ImpResDescan consistently outperforms related restoration models in both fidelity and perceptual quality. On DESCAN-18K, it surpasses DescanDiffusion+ by +0.9 dB PSNR, −3.43 FID, and −18.6% LPIPS.
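To make the MS-SWD regularizer concrete, the sketch below computes a sliced Wasserstein distance between the color distributions of two images, averaged over several scales. This is an illustrative reconstruction, not the authors' implementation: the function names, the number of projections, and the use of strided subsampling as a stand-in for a proper low-pass pyramid are all assumptions.

```python
import numpy as np

def sliced_wasserstein(colors_a, colors_b, n_proj=64, rng=None):
    """Sliced Wasserstein-2 distance between two (N, 3) color point sets.

    Projects both sets onto random unit directions, sorts each 1-D
    projection, and averages the squared gap between sorted samples.
    Hypothetical helper; assumes both sets have the same size N.
    """
    rng = np.random.default_rng(rng)
    dirs = rng.normal(size=(n_proj, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    pa = np.sort(colors_a @ dirs.T, axis=0)  # (N, n_proj)
    pb = np.sort(colors_b @ dirs.T, axis=0)
    return np.mean((pa - pb) ** 2)

def ms_swd(img_a, img_b, scales=(1, 2, 4), **kw):
    """Multiscale SWD over dyadic subsamplings of two (H, W, 3) images.

    Strided subsampling stands in for the multiscale decomposition;
    the actual MS-SWD may use Gaussian pyramids instead.
    """
    total = 0.0
    for s in scales:
        a = img_a[::s, ::s].reshape(-1, 3)
        b = img_b[::s, ::s].reshape(-1, 3)
        total += sliced_wasserstein(a, b, **kw)
    return total / len(scales)
```

Because the distance compares sorted 1-D projections rather than spatially aligned pixels, it penalizes global color-distribution mismatch while staying insensitive to the geometric misalignment the ambient training strategy must tolerate.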
Primary Area: generative models
Submission Number: 7121