Lightweight Dual Attention Multi-Scale Inverted Residual Neural Network for Image Inpainting

Kuan-Hsien Liu, Chun-Chieh Chang, Tsung-Jung Liu

Published: 2025, Last Modified: 21 Apr 2026, SMC 2025, CC BY-SA 4.0
Abstract: We propose DA-MSIRNet, a lightweight architecture for high-quality image inpainting that extends the standard U-Net with four key innovations: (1) Context Anchor Attention (CAA), which models global context efficiently through adaptive region selection; (2) Sparse Self-Attention (SpA), inspired by Spa-former, which refines local detail dynamically by focusing on salient relationships; (3) Multi-Scale Inverted Residual (MSIR) modules, which improve multi-scale feature fusion through optimized skip connections; and (4) a Structural Similarity (SSIM) loss for improved perceptual quality and fidelity. DA-MSIRNet addresses critical limitations of existing methods, including GAN training instability, U-Net’s restricted receptive field, and the computational complexity of Transformers. Comprehensive evaluations on the Places2 and CelebA-HQ datasets demonstrate that DA-MSIRNet achieves state-of-the-art performance in both quantitative metrics (PSNR, SSIM, FID) and visual quality, while maintaining superior computational efficiency. The code for DA-MSIRNet is publicly available on GitHub: https://github.com/nutcliu2507/DA-MSIRNet.
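To make the SSIM loss named in item (4) concrete, the following is a minimal sketch of a loss of the form 1 - SSIM, using the standard SSIM formula with its usual stabilizing constants. It is an illustration under stated assumptions, not the authors' implementation: the function names `global_ssim` and `ssim_loss` are hypothetical, images are assumed to be flat lists of grayscale values in [0, 1], and statistics are taken over the whole image rather than over local windows as practical implementations typically do.

```python
from statistics import fmean


def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM over two equally sized, flattened grayscale images.

    Simplifying assumption: one global window instead of sliding local windows.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and have the same size")

    # Standard stabilizing constants, K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2

    mu_x = fmean(x)
    mu_y = fmean(y)
    var_x = fmean([(a - mu_x) * (a - mu_x) for a in x])
    var_y = fmean([(b - mu_y) * (b - mu_y) for b in y])
    cov_xy = fmean([(a - mu_x) * (b - mu_y) for a, b in zip(x, y)])

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def ssim_loss(pred, target, data_range=1.0):
    """Hypothetical loss term: near zero when the images are structurally identical."""
    return 1.0 - global_ssim(pred, target, data_range)
```

In a training setup, a term like this would normally be computed on tensors with an automatic-differentiation framework and combined with other reconstruction losses; the weighting used by DA-MSIRNet is not stated in the abstract.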