Abstract: The development of deep learning has greatly improved image inpainting performance over the past decades. In practice, different inpainting tasks usually require different models: highly structured images call for the restoration of structural consistency, whereas textured images call for the reconstruction of local high-frequency details. However, it remains challenging to design an effective algorithm that accounts for global structure and texture details separately. Herein, we propose a two-stage inpainting method that combines information from the frequency and spatial domains. The Stationary Wavelet Transform (SWT), which has good time-frequency characteristics, is applied to obtain sub-band images that serve as the basic inputs for frequency-domain inpainting. Contextual Attention Layer (CAL) modules are optionally introduced into the network to adapt it to various inpainting tasks. We also test and discuss the impact of several commonly used loss functions, including the standard L1 loss, the standard GAN loss, a weighted L1 loss, and the WGAN-GP loss, on highly structured and textured images.
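
To illustrate what "sub-band images" means here, the following is a minimal sketch of a one-level stationary (undecimated) wavelet decomposition. It is not the authors' implementation: the abstract does not state the wavelet, the number of levels, or the border handling, so the Haar filter, the single level, the periodic wrap-around, and the averaging normalization (division by 2) are all assumptions made for illustration, and the function name `haar_swt2_level1` is hypothetical.

```python
def haar_swt2_level1(image):
    """One-level undecimated Haar transform of a 2-D list of numbers.

    Returns four sub-bands, each the same size as the input, because a
    stationary transform skips the downsampling step.
    Assumptions of this sketch: Haar filter, periodic borders,
    averaging normalization (divide by 2 rather than sqrt(2)).
    """
    rows, cols = len(image), len(image[0])

    def along_rows(img, sign):
        # Combine each pixel with its right-hand neighbour (wraps around).
        # sign=+1 gives the low-pass average, sign=-1 the high-pass difference.
        return [[(img[r][c] + sign * img[r][(c + 1) % cols]) / 2.0
                 for c in range(cols)] for r in range(rows)]

    def along_cols(img, sign):
        # Same filter, applied between each pixel and the one below it.
        return [[(img[r][c] + sign * img[(r + 1) % rows][c]) / 2.0
                 for c in range(cols)] for r in range(rows)]

    low = along_rows(image, +1)
    high = along_rows(image, -1)

    approx = along_cols(low, +1)        # low-pass in both directions
    detail_rows = along_cols(low, -1)   # low along rows, high along columns
    detail_cols = along_cols(high, +1)  # high along rows, low along columns
    detail_diag = along_cols(high, -1)  # high-pass in both directions
    return approx, detail_rows, detail_cols, detail_diag


# A 4x4 image with a vertical edge between the second and third columns.
image = [[0, 0, 4, 4] for _ in range(4)]
approx, detail_rows, detail_cols, detail_diag = haar_swt2_level1(image)
```

Because every sub-band keeps the spatial size of the input, the four outputs can be stacked as channels of a single network input, which is one plausible reason to prefer the SWT over a decimated wavelet transform for inpainting. In practice a library routine such as PyWavelets' `pywt.swt2` would normally replace this hand-written filter.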