Abstract: As masked face images can significantly degrade the performance of face-related tasks, face mask removal remains an important and challenging task. In this paper, we propose a novel learning framework, called MaskDiffuse, to remove face masks based on Denoising Diffusion Probabilistic Model (DDPM). In particular, we leverage CLIP to fill the missing parts by guiding the reverse process of pretrained diffusion model with text prompts. Furthermore, we propose a multi-stage blending strategy to preserve the unmasked areas and a conditional resampling approach to make the generated contents consistent with the unmasked regions. Thus, our method achieves interactive user-controllable and identity-preserving masking removal with high quality. Both qualitative and quantitative experimental results demonstrate the superiority of our method for mask removal over alternative methods.
Loading