Abstract: We present attentional local contrastive learning (ALCL), a novel representation learning module for face forgery detection. ALCL imposes an explicit constraint that separates forged regions from pristine ones. Specifically, the feature vectors extracted by the backbone are first embedded onto a unit hypersphere; then, for each local feature vector, ALCL constructs horizontal and vertical triple sets from its adjacent vectors. ALCL minimizes the angle between vectors from the same source and maximizes the angle between vectors from different sources by optimizing their normalized cosine similarity. We also propose a multiple scale residual learning (MSRL) module that exploits rich residual information to complement the RGB input. Comprehensive experiments on multiple challenging face forgery benchmarks show that our method performs strongly in both in-domain and cross-domain settings and shows good robustness to compression relative to existing methods.
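To make the local contrastive constraint concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a feature map of shape (H, W, C) with H and W at least 3, and a hypothetical per-location boolean mask marking forged positions. It further assumes that "normalized cosine similarity" means mapping the cosine to [0, 1] via (1 + cos) / 2, and it uses a simple illustrative pairwise penalty; the attention mechanism and the actual loss used by ALCL are not specified in the abstract and are not modeled here. The function names `l2_normalize`, `triple_losses`, and `local_contrastive_loss` are illustrative.

```python
import numpy as np


def l2_normalize(features, eps=1e-12):
    """Scale each local feature vector (last axis) to unit length."""
    norms = np.linalg.norm(features, axis=-1, keepdims=True)
    return features / np.maximum(norms, eps)


def triple_losses(z, mask):
    """Per-pair losses for triples (previous, centre, next) taken along axis 0.

    z: unit-norm features with shape (N, M, C); mask: booleans with shape (N, M).
    """
    centre, centre_m = z[1:-1], mask[1:-1]
    losses = []
    for nb, nb_m in ((z[:-2], mask[:-2]), (z[2:], mask[2:])):
        cos = np.sum(centre * nb, axis=-1)   # cosine of unit vectors
        sim = 0.5 * (1.0 + cos)              # assumed normalization to [0, 1]
        same = centre_m == nb_m              # same source (both forged or both pristine)
        # Illustrative penalty: pull same-source pairs together, push others apart.
        losses.append(np.where(same, 1.0 - sim, sim).ravel())
    return np.concatenate(losses)


def local_contrastive_loss(features, forged_mask):
    """Mean penalty over vertical and horizontal triples (requires H, W >= 3)."""
    z = l2_normalize(np.asarray(features, dtype=np.float64))
    mask = np.asarray(forged_mask, dtype=bool)
    vertical = triple_losses(z, mask)
    # Swapping the spatial axes turns horizontal neighbours into axis-0 neighbours.
    horizontal = triple_losses(np.swapaxes(z, 0, 1), mask.T)
    return float(np.mean(np.concatenate([vertical, horizontal])))


# Toy example: the left half of a 4x4 map is marked as forged.
mask = np.zeros((4, 4), dtype=bool)
mask[:, :2] = True
# Forged and pristine features point in opposite directions (well separated).
separated = np.where(mask[..., None], [2.0, 0.0, 0.0], [-3.0, 0.0, 0.0])
# All features identical (no separation between sources).
uniform = np.ones((4, 4, 3))

print(local_contrastive_loss(separated, mask))
print(local_contrastive_loss(uniform, mask))
```

In this toy setting, the well-separated map incurs a lower penalty than the uniform map, because only the latter has different-source neighbours that point in the same direction.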