Abstract: As deepfake generation models advance, significant research has focused on the deepfake detection task; however, challenges in model generalization persist. This paper introduces a novel audio deepfake detection method that achieves robust performance by leveraging voice identity information. By employing contrastive learning with voice identity features as auxiliary data, the proposed approach generalizes effectively, detecting data generated by models unseen during training. These findings show that effectively utilizing voice identity enables audio deepfake detection algorithms to be applied in real-world scenarios without additional labeling effort. Experimental results confirm the robustness of the proposed algorithm against variations in both audio deepfake generation models and domains.
External IDs: dblp:conf/iccel/ChoiKC25