Temporal Diversified Self-Contrastive Learning for Generalized Face Forgery Detection

Published: 2024, Last Modified: 07 Jan 2026. Venue: IEEE Trans. Circuits Syst. Video Technol. 2024. License: CC BY-SA 4.0
Abstract: Face forgery detection has received widespread attention because of the serious security threats posed by advances in face forgery technologies. Most existing works treat it as a binary classification problem, modeling spatial and temporal artifacts to distinguish real from fake videos. However, such detectors tend to rely heavily on the binary labels and overfit to the method-specific forgery patterns of the training set, which limits their generalization ability. To mitigate this issue, we propose a Temporal Diversified Self-Contrastive Learning (TDSCL) framework, which guides the model to exploit generalized temporal inconsistencies for face forgery detection. First, a Temporally Diversified Transformation (TDT) strategy is designed to create diverse training samples at multiple temporal scales. Next, Short-term Self-contrastive Learning (STSC) and Long-term Self-contrastive Learning (LTSC) are proposed to learn temporal representations of the video at different temporal granularities, capturing intrinsic and generalized forensic clues that expose fake videos; both can serve as auxiliary supervision and be flexibly combined with different backbones. Moreover, a Similarity-Guided Adaptive Fusion (SGAF) module is designed to adaptively reinforce the temporal inconsistencies for reliable classification. Extensive experiments verify that the proposed method achieves superior generalization ability over various state-of-the-art methods on different benchmark datasets.
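The abstract does not specify how TDT builds samples "at multiple temporal scales". As a rough illustration of the general idea only, the sketch below draws one clip of frame indices per temporal stride from a single video, so that small strides cover short-term motion and large strides cover long-term changes. The function name `sample_multiscale_clips`, its parameters, and the stride values are hypothetical and are not taken from the paper; the actual TDT strategy may differ substantially.

```python
import random


def sample_multiscale_clips(num_frames, clip_len, strides, seed=0):
    """Return one list of frame indices per temporal stride.

    Hypothetical helper for illustration; not the paper's TDT.
    """
    rng = random.Random(seed)
    clips = {}
    for stride in strides:
        # Number of frames spanned by one clip sampled at this stride.
        span = (clip_len - 1) * stride + 1
        if span > num_frames:
            continue  # video too short for this temporal scale
        # random.Random.randint includes both endpoints.
        start = rng.randint(0, num_frames - span)
        clips[stride] = [start + i * stride for i in range(clip_len)]
    return clips


# Assumed example values: a 64-frame video, 8-frame clips, three scales.
clips = sample_multiscale_clips(num_frames=64, clip_len=8, strides=(1, 2, 4))
for stride, indices in clips.items():
    print(stride, indices)
```

Each clip has the same length but a different temporal extent, which is one simple way to present a model with both short-term and long-term views of the same video. Strides whose clip would not fit in the video are skipped rather than padded.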