Face Forgery Detection Based on Fine-Grained Clues and Noise Inconsistency

Published: 01 Jan 2025, Last Modified: 19 Apr 2025, IEEE Trans. Artif. Intell. 2025, CC BY-SA 4.0
Abstract: Deepfake detection has attracted growing research attention in media forensics, and many detection methods have been proposed. However, compression can erase subtle forgery artifacts, so detectors based on convolutional neural networks (CNNs) often fail on compressed fake face images. In this work, we propose a two-stream network for deepfake detection. We observe that high-frequency noise features and spatial features are inherently complementary, so we exploit both for face forgery detection. Specifically, we design a double-frequency transformer module (DFTM) to guide the learning of spatial features from local artifact regions, and a dual-domain attention fusion module (DDAFM) to fuse the spatial and high-frequency noise features effectively. We also introduce a local relationship constraint loss for model training that requires only image-level labels. We evaluate the proposed approach on five large-scale benchmark datasets, and extensive experimental results show that it outperforms most state-of-the-art (SOTA) methods.
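The abstract does not say how the high-frequency noise stream is computed. As a rough illustration of what a high-frequency residual is, the sketch below applies a fixed 3x3 high-pass kernel to a grayscale image and measures the mean squared response. The kernel, the function names `high_pass_residual` and `residual_energy`, and the toy images are all assumptions made for this example. They are not the paper's noise extractor, DFTM, or DDAFM.

```python
# Hypothetical illustration only: a fixed 3x3 high-pass kernel whose weights
# sum to zero, so smooth (constant) regions produce no response.
# This is NOT the noise extractor described in the paper.
HIGH_PASS = [
    [-1, -1, -1],
    [-1,  8, -1],
    [-1, -1, -1],
]


def high_pass_residual(image):
    """Return the 'valid' high-pass response of a 2D grayscale image.

    `image` is a list of equal-length rows of numbers. The output drops a
    one-pixel border, so it has shape (h - 2) x (w - 2).
    """
    h, w = len(image), len(image[0])
    out = []
    for y in range(1, h - 1):
        row = []
        for x in range(1, w - 1):
            acc = 0
            for dy in range(3):
                for dx in range(3):
                    acc += HIGH_PASS[dy][dx] * image[y + dy - 1][x + dx - 1]
            row.append(acc)
        out.append(row)
    return out


def residual_energy(image):
    """Mean squared high-pass response: a crude measure of high-frequency content."""
    values = [v for row in high_pass_residual(image) for v in row]
    return sum(v * v for v in values) / len(values)


# Toy inputs (assumed for the example): a flat patch and a checkerboard patch.
flat = [[100] * 6 for _ in range(6)]
textured = [[100 + (10 if (x + y) % 2 else -10) for x in range(6)]
            for y in range(6)]

print(residual_energy(flat))      # flat patch: no high-frequency content
print(residual_energy(textured))  # checkerboard: strong high-frequency content
```

The flat patch yields zero energy while the checkerboard yields a large one. The residual therefore carries information that the raw pixel intensities do not show directly, which matches the complementarity the abstract attributes to the two feature streams.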