Abstract: Blind image inpainting is extremely challenging because the contamination is unknown, multi-property, and varies across contaminated images. Current mainstream approaches decompose blind image inpainting into two stages: mask estimation from the contaminated image, followed by image inpainting based on the estimated mask. This two-stage solution requires two separate CNN-based encoder-decoder architectures, one for estimation and one for inpainting. In this work, we propose a novel one-stage Transformer-CNN Hybrid AutoEncoder (TransCNN-HAE) for blind image inpainting, which intuitively follows an inpainting-then-reconstructing pipeline: the global long-range contextual modeling of the Transformer repairs contaminated regions, and the local short-range contextual modeling of the CNN reconstructs the repaired image. Moreover, a Cross-layer Dissimilarity Prompt (CDP) is devised to accelerate the identification and inpainting of contaminated regions. Ablation studies validate the efficacy of both TransCNN-HAE and CDP, and extensive experiments on various datasets with multi-property contaminations show that our method achieves state-of-the-art performance on blind image inpainting at a much lower computational cost. Our code is available at https://github.com/zhenglab/TransCNN-HAE.
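To make the one-stage inpainting-then-reconstructing pipeline concrete, the following is a minimal sketch of a Transformer-CNN hybrid autoencoder in PyTorch. All module names, patch settings, and layer sizes here are illustrative assumptions, not the authors' implementation (and the CDP module is omitted); see the official repository linked above for the actual method.

```python
# Illustrative sketch only: a contaminated image is tokenized into patches, a
# Transformer encoder applies global long-range contextual modeling, and a CNN
# decoder applies local short-range modeling to reconstruct the repaired image.
import torch
import torch.nn as nn


class TransformerCNNAutoEncoder(nn.Module):
    def __init__(self, img_size=256, patch_size=16, embed_dim=256, depth=4, heads=8):
        super().__init__()
        self.embed_dim = embed_dim
        num_patches = (img_size // patch_size) ** 2

        # Patch embedding: split the contaminated image into non-overlapping patches.
        self.patch_embed = nn.Conv2d(3, embed_dim, kernel_size=patch_size, stride=patch_size)
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches, embed_dim))

        # Transformer encoder: global context to repair contaminated tokens (no mask given).
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=heads,
                                           dim_feedforward=4 * embed_dim,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

        # CNN decoder: local context to reconstruct the repaired image at full resolution.
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(embed_dim, 128, kernel_size=4, stride=4),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(128, 64, kernel_size=4, stride=4),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 3, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        b = x.size(0)
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)   # (B, N, C)
        tokens = self.encoder(tokens + self.pos_embed)
        h = w = int(tokens.size(1) ** 0.5)
        feat = tokens.transpose(1, 2).reshape(b, self.embed_dim, h, w)
        return self.decoder(feat)


if __name__ == "__main__":
    model = TransformerCNNAutoEncoder()
    contaminated = torch.rand(1, 3, 256, 256)   # blind setting: no mask is provided
    restored = model(contaminated)
    print(restored.shape)                       # torch.Size([1, 3, 256, 256])
```

Note that, unlike the two-stage mask-estimation-then-inpainting pipeline, this single network maps the contaminated image directly to a restored image without an explicit mask branch.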