SAFE: Sequential Attentive Face Embedding with Contrastive Learning for Deepfake Video Detection

Published: 20 Oct 2023, Last Modified: 17 Oct 2025 | CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management | Everyone | CC BY-NC-ND 4.0
Abstract: The emergence of hyper-realistic deepfake videos has raised significant concerns regarding their potential misuse. However, prior research on deepfake detection has primarily focused on image-based approaches, with little emphasis on video. With the advancement of generation techniques enabling intricate and dynamic manipulation of entire faces as well as specific facial components in a video sequence, capturing dynamic changes in both global and local facial features becomes crucial in detecting deepfake videos. This paper proposes a novel sequential attentive face embedding, SAFE, that can capture facial dynamics in a deepfake video. The proposed SAFE can effectively integrate global and local dynamics of facial features revealed in a video sequence using contrastive learning. Through a comprehensive comparison with state-of-the-art methods on the DFDC (Deepfake Detection Challenge) dataset and the FaceForensics++ benchmark, we show that our model achieves the highest accuracy in detecting deepfake videos on both datasets.