Abstract: We propose DiffFace, a novel diffusion-based framework for face swapping. Unlike previous GAN-based models, which inherit the difficulties of GAN training, DiffFace trains an ID-conditional DDPM to produce face images with a desired identity. During sampling, off-the-shelf facial expert models guide the process so that the source identity is transferred while target attributes such as structure and gaze are maintained. In addition, the proposed target-preserving blending effectively preserves the expression of the target image against the injected noise while reflecting environmental context such as background and lighting. The method also allows the trade-off between identity and shape to be controlled without any re-training. Compared with previous GAN-based methods, DiffFace achieves high fidelity and controllability, and extensive experiments show that it is comparable or superior to state-of-the-art methods.
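To make the described sampling procedure concrete, below is a minimal PyTorch sketch of one reverse-diffusion step combining facial-expert guidance with target-preserving blending. It is illustrative only: `unet`, `id_expert`, `mask`, and the noise schedules are hypothetical placeholders rather than the authors' released code, and the blending follows a generic noised-target masking scheme assumed here for clarity.

```python
# A minimal sketch of one guided reverse-diffusion step.
# All names (unet, id_expert, mask, schedules) are hypothetical placeholders.
import torch

@torch.enable_grad()
def guided_step(x_t, t, unet, id_expert, src_id, x_target, mask,
                alphas, alpha_bars, scale=1.0):
    """One DDPM reverse step with identity guidance and masked blending."""
    x_t = x_t.detach().requires_grad_(True)
    eps = unet(x_t, t)                                   # predicted noise
    ab_t = alpha_bars[t]
    # Estimate the clean image x0 from the current noisy sample.
    x0_hat = (x_t - (1 - ab_t).sqrt() * eps) / ab_t.sqrt()
    # Facial-expert guidance: push x0_hat toward the source identity.
    id_loss = (1 - torch.cosine_similarity(
        id_expert(x0_hat), src_id, dim=-1)).mean()
    grad = torch.autograd.grad(id_loss, x_t)[0]
    eps = eps + scale * (1 - ab_t).sqrt() * grad         # guided noise
    # Standard DDPM posterior mean (sigma_t^2 = beta_t variance choice).
    a_t = alphas[t]
    mean = (x_t - (1 - a_t) / (1 - ab_t).sqrt() * eps) / a_t.sqrt()
    z = torch.randn_like(x_t) if t > 0 else 0.0
    x_prev = mean + (1 - a_t).sqrt() * z
    # Target-preserving blending: non-face regions are re-drawn from a
    # noised copy of the target, keeping background and lighting intact.
    ab_prev = alpha_bars[t - 1] if t > 0 else torch.tensor(1.0)
    target_noisy = (ab_prev.sqrt() * x_target
                    + (1 - ab_prev).sqrt() * torch.randn_like(x_target))
    return mask * x_prev + (1 - mask) * target_noisy
```

In this sketch the mask confines denoising to the face region, so non-face pixels are repeatedly replaced by a noised copy of the target at the matching noise level, which is one way the environmental context mentioned above can survive the sampling process.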