Abstract: The proliferation of deepfakes poses significant challenges to identity authentication and content integrity. Consequently, the field of deepfake detection has garnered considerable attention, with numerous researchers proposing methods to distinguish genuine images from fake ones. In this paper, we investigate a scenario in which deepfake detection models face the threat of backdoor attacks. Through a comprehensive evaluation of backdoor attacks and defense strategies, we provide an insightful analysis of their outcomes. Our findings reveal that defense methods may exhibit weaknesses on complex tasks such as deepfake detection, for three reasons. First, the common practice in deepfake detection of cropping faces from images before classification may leave the trigger outside the cropped region. Second, deepfake detection is a difficult task, especially in the presence of triggers, because it requires the model to learn subtle features. Third, the number of classes in a classification task is crucial for developing defenses against backdoor attacks. This study advances our understanding of backdoor attacks in the context of deepfake detection and urges the development of more robust defense mechanisms.