Abstract: Face anti-spoofing (FAS) is an indispensable step in face recognition systems. To distinguish spoofed faces from genuine ones, existing methods typically rely on sophisticated handcrafted features or carefully designed supervised networks to learn discriminative representations. This paper proposes a novel FAS approach inspired by generative self-supervised learning. It has three merits: it does not need massive amounts of labeled images, it has excellent discriminative ability, and its learned features transfer well. First, in the pretext task, a masked image modeling strategy is used to learn general fine-grained features by reconstructing image patches in an unsupervised encoder-decoder structure. Second, the encoder knowledge is transferred to the downstream FAS task. Finally, all network parameters are fine-tuned using only binary labels. Extensive experiments on three standard benchmarks show that our method comes very close to state-of-the-art FAS performance, which indicates that masked image modeling can learn discriminative facial detail features that benefit FAS.
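The pretext task described in the abstract can be illustrated with a minimal NumPy sketch of patch masking and a reconstruction loss computed only on the masked patches. This is not the authors' implementation: the image size (224), patch size (16), mask ratio (0.75), the mean-squared-error loss, and the function names `patchify`, `random_mask`, and `masked_reconstruction_loss` are all assumptions chosen for illustration, and the all-zeros prediction is a stand-in for the output of a real encoder-decoder.

```python
import numpy as np


def patchify(image, patch_size):
    """Split an (H, W, C) image into non-overlapping, flattened patches."""
    h, w, c = image.shape
    assert h % patch_size == 0 and w % patch_size == 0
    gh, gw = h // patch_size, w // patch_size
    # (gh, p, gw, p, c) -> (gh, gw, p, p, c) -> (gh * gw, p * p * c)
    patches = image.reshape(gh, patch_size, gw, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    return patches.reshape(gh * gw, patch_size * patch_size * c)


def random_mask(num_patches, mask_ratio, rng):
    """Return sorted (visible, masked) patch indices for a random mask."""
    num_masked = int(round(num_patches * mask_ratio))
    perm = rng.permutation(num_patches)
    masked = np.sort(perm[:num_masked])
    visible = np.sort(perm[num_masked:])
    return visible, masked


def masked_reconstruction_loss(pred, target, masked_idx):
    """Mean squared error over the masked patches only (assumed loss)."""
    diff = pred[masked_idx] - target[masked_idx]
    return float(np.mean(diff ** 2))


rng = np.random.default_rng(0)
image = rng.random((224, 224, 3))      # stand-in for a face image (assumed size)
patches = patchify(image, 16)          # assumed patch size of 16

# Assumed mask ratio of 0.75; the abstract does not state one.
visible_idx, masked_idx = random_mask(len(patches), 0.75, rng)

# Only the visible patches would be passed to the encoder.
encoder_input = patches[visible_idx]

# Placeholder for the decoder output; a real model would predict every patch.
pred = np.zeros_like(patches)
loss = masked_reconstruction_loss(pred, patches, masked_idx)
```

After pretraining of this kind, the abstract's second and third steps would keep the encoder, attach a binary (genuine vs. spoof) classification head, and fine-tune all parameters on labeled faces; those steps are not shown here.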