CodeSwap: Symmetrically Face Swapping Based on Prior Codebook

Published: 20 Jul 2024 · Last Modified: 03 Aug 2024 · MM 2024 Poster · CC BY 4.0
Abstract: Face swapping, the technique of transferring the identity from one face to another, has emerged as a field with significant practical applications. However, previous swapping methods often produce visible artifacts. To address this issue, we propose $CodeSwap$, a symmetrical framework that achieves face swapping with high fidelity and realism. Specifically, our method first utilizes a codebook that captures the knowledge of high-quality facial features. Building on this foundation, face swapping is then converted into a code manipulation task in the code space. To achieve this, we design a Transformer-based architecture that updates each code independently, enabling more precise manipulations. Furthermore, we incorporate a mask generator to achieve seamless blending of the generated face with the background of the target image. A distinctive characteristic of our method is its symmetrical treatment of the target and source images, simultaneously extracting information from each to improve the quality of face swapping. This symmetry also enables the bidirectional exchange of faces in a single operation. Through extensive experiments on CelebA-HQ and FF++, our method is shown not only to achieve efficient identity transfer but also to substantially reduce visible artifacts.
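For intuition, the sketch below illustrates the kind of pipeline the abstract describes: discrete code tokens drawn from a prior codebook, a Transformer that processes source and target symmetrically so one forward pass yields both swap directions, per-token code prediction, and a mask head for blending. This is a minimal, hypothetical PyTorch sketch; every module name, dimension, and design choice is our illustrative assumption, not the authors' actual implementation.

```python
import torch
import torch.nn as nn

class CodeSwapSketch(nn.Module):
    """Illustrative sketch (not the paper's code): swap identities by
    editing discrete code tokens from a pretrained prior codebook,
    using a Transformer that treats source and target symmetrically."""

    def __init__(self, num_codes=1024, code_dim=256, num_tokens=256,
                 depth=4, heads=8):
        super().__init__()
        # Prior codebook of high-quality facial features (assumed frozen,
        # e.g. learned beforehand with a VQ-style autoencoder).
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.pos = nn.Parameter(torch.zeros(1, num_tokens, code_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=code_dim, nhead=heads, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=depth)
        # Per-token classifier: predicts the updated code index at each
        # position independently, allowing precise local manipulation.
        self.to_logits = nn.Linear(code_dim, num_codes)
        # Mask head for blending the generated face into the target background.
        self.to_mask = nn.Linear(code_dim, 1)

    def forward(self, src_idx, tgt_idx):
        # src_idx, tgt_idx: (B, num_tokens) code indices from a quantizer.
        src = self.codebook(src_idx) + self.pos
        tgt = self.codebook(tgt_idx) + self.pos
        # Symmetric processing: concatenate both token sequences so each
        # face attends to the other; a single pass covers both directions.
        x = self.transformer(torch.cat([tgt, src], dim=1))
        tgt_half, src_half = x.chunk(2, dim=1)
        return (self.to_logits(tgt_half),            # source identity -> target face
                self.to_logits(src_half),            # target identity -> source face
                torch.sigmoid(self.to_mask(tgt_half)))  # per-token blending mask

# Usage: in practice the indices would come from encoding real face images;
# random indices here only demonstrate the tensor shapes.
model = CodeSwapSketch()
src = torch.randint(0, 1024, (2, 256))
tgt = torch.randint(0, 1024, (2, 256))
logits_a, logits_b, mask = model(src, tgt)
```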
Primary Subject Area: [Generation] Generative Multimedia
Secondary Subject Area: [Experience] Multimedia Applications
Relevance To Conference: Our paper introduces an innovative face-swapping technique, a novel approach to media content editing that aligns with the conference's focus on cutting-edge multimedia technologies. In addition, our approach integrates and generates information from multiple sources: realistic facial features must be synthesized from different data sources, which is conceptually consistent with multimodal information processing.
Supplementary Material: zip
Submission Number: 2660