Sparse self-attention transformer for image inpainting

Published: 01 Jan 2024, Last Modified: 05 Mar 2025 · Pattern Recognit. 2024 · CC BY-SA 4.0
Abstract

Highlights
• To retain the long-range modeling capacity of transformer blocks while reducing the computational burden, we introduce a novel U-Net-style transformer-based network, the sparse self-attention transformer (Spa-former), for the image inpainting task.
• A transformer block that computes channel attention is adopted to model global pixel relationships.
• We adopt the ReLU function as the activation to obtain a sparse attention/feature map, in which coefficients with low or no correlation are removed from the attention map (see the sketch after the highlights).
• Experiments on challenging benchmarks demonstrate the superior performance of our Spa-former over state-of-the-art approaches.
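To make the ReLU-sparsified channel attention concrete, below is a minimal PyTorch sketch of a channel-wise self-attention block in which ReLU replaces the usual softmax, so entries with low or negative correlation are zeroed out of the attention map. This is an illustrative assumption-based sketch, not the paper's implementation: the module name, head count, 1x1-convolution projections, and the L2 normalisation of queries and keys are choices made here for clarity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseChannelAttention(nn.Module):
    """Channel-wise self-attention with a ReLU-sparsified attention map.

    Attention is computed across channels (a C x C map) rather than across
    spatial positions, and ReLU replaces softmax so that low/negative
    correlations become exactly zero (hypothetical sketch, not the paper's code).
    """

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.num_heads = num_heads
        # Learnable scaling of the attention logits per head.
        self.temperature = nn.Parameter(torch.ones(num_heads, 1, 1))
        self.qkv = nn.Conv2d(dim, dim * 3, kernel_size=1, bias=False)
        self.project_out = nn.Conv2d(dim, dim, kernel_size=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=1)

        # Reshape to (batch, heads, channels_per_head, spatial).
        head_dim = c // self.num_heads
        q = q.view(b, self.num_heads, head_dim, h * w)
        k = k.view(b, self.num_heads, head_dim, h * w)
        v = v.view(b, self.num_heads, head_dim, h * w)

        # Normalise along the spatial axis, then form a channel-by-channel map.
        q = F.normalize(q, dim=-1)
        k = F.normalize(k, dim=-1)
        attn = (q @ k.transpose(-2, -1)) * self.temperature

        # ReLU instead of softmax: weakly or negatively correlated channel
        # pairs are removed, yielding a sparse attention map.
        attn = F.relu(attn)

        out = attn @ v                      # (b, heads, head_dim, h*w)
        out = out.view(b, c, h, w)
        return self.project_out(out)


if __name__ == "__main__":
    x = torch.randn(1, 64, 32, 32)
    block = SparseChannelAttention(dim=64, num_heads=4)
    print(block(x).shape)  # torch.Size([1, 64, 32, 32])
```

Because the attention map is C x C rather than (HW) x (HW), the cost grows with the channel count instead of quadratically with image resolution, which is consistent with the stated goal of keeping long-range modeling while reducing the computational burden.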