Adversarial Diffusion Network for Dunhuang Mural Inpainting

Published: 01 Jan 2025, Last Modified: 30 Jul 2025 · IEEE Trans. Circuits Syst. Video Technol. 2025 · CC BY-SA 4.0
Abstract: Dunhuang mural inpainting aims to fill in the missing regions of damaged murals with realistic content. The denoising diffusion probabilistic model (DDPM) has made great strides in semantic generation and shown promising results in image inpainting. However, three challenges prevent existing diffusion-based methods from restoring the Dunhuang murals: 1) because most pixels have faded over time, effective visual information cannot be accurately extracted; 2) semantic discrepancies arise between the damaged and visible regions in the inpainting results; and 3) the original structure and style of the damaged regions cannot be adequately restored. To this end, we propose a novel adversarial diffusion model for mural inpainting, which consists of: 1) a mural enhancement module, the pixel-enhanced fire-controlled pulse-coupled neural network (PEFCPCNN), designed to enhance faded pixels so that the visual features of the mural can be accurately extracted; 2) a novel adversarial diffusion framework that optimizes the sampling prediction of the mural over time steps; and 3) line drawings and multiple loss functions that constrain the reconstructed content to approximate the structure and style of the original mural. A variational transform layer (VTL) and a multi-scale contextual feature aggregation (MCFA) module are further proposed to reconstruct content that is structurally coherent and texturally reasonable. Experiments on the Dunhuang mural dataset demonstrate that the proposed method outperforms state-of-the-art methods in terms of both the semantic reasonableness and the global semantic consistency of the inpainted content.
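The abstract's diffusion-based inpainting setup rests on two standard ingredients: the DDPM forward noising process, and the common mask-blending trick used by diffusion inpainters, where at each reverse step the visible (intact) region is overwritten with a noised copy of the original image so that only the damaged region is synthesized. The sketch below illustrates these two generic mechanisms only; the paper's actual contributions (PEFCPCNN enhancement, the adversarial framework, VTL, MCFA) are not reproduced here, and names such as `blend_known_region` are illustrative assumptions, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 1000
betas = np.linspace(1e-4, 0.02, T)        # linear noise schedule beta_t
alphas_bar = np.cumprod(1.0 - betas)      # cumulative product \bar{alpha}_t

def q_sample(x0, t, noise):
    """DDPM forward process: x_t = sqrt(abar_t)*x0 + sqrt(1 - abar_t)*eps."""
    ab = alphas_bar[t]
    return np.sqrt(ab) * x0 + np.sqrt(1.0 - ab) * noise

def blend_known_region(x_t, x0, mask, t, rng):
    """Mask blending at one reverse step: keep the model's sample only where
    mask == 1 (damaged pixels); elsewhere substitute a freshly noised copy
    of the intact mural, so visible content anchors the generation."""
    known = q_sample(x0, t, rng.standard_normal(x0.shape))
    return mask * x_t + (1.0 - mask) * known

# Toy 8x8 "mural" whose right half is damaged (mask == 1 there).
x0 = rng.random((8, 8))
mask = np.zeros((8, 8))
mask[:, 4:] = 1.0
x_t = rng.standard_normal((8, 8))         # current reverse-process sample
x_blend = blend_known_region(x_t, x0, mask, t=500, rng=rng)
```

In a full inpainter this blending runs at every time step of the reverse loop; the paper's adversarial framework additionally refines the sampling prediction across those steps.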