Efficient Audio Deepfake Detection using WavLM with Early Exiting

Published: 01 Jan 2024, Last Modified: 07 Oct 2025, Venue: WIFS 2024, License: CC BY-SA 4.0
Abstract: The rapid development of audio generation techniques has made it increasingly easy to create sophisticated audio deepfakes, posing significant threats to individual privacy and security. To combat this, effective methods for detecting audio deepfakes are crucial. Deep learning techniques, particularly pre-trained self-supervised speech representations, such as WavLM, have shown promise in addressing this challenge. However, their computational inefficiencies limit their deployment in real-world edge speech applications. This paper investigates the use of early exiting applied to WavLM, to achieve reliable classification of genuine and spoofed audio samples with a 50% reduction in the number of model parameters and a relative performance improvement of up to 12%. The proposed method is a powerful candidate for deepfake detection in edge applications with limited computational resources.
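The abstract does not say how the exits are placed or triggered, so the following is only a minimal sketch of the general early-exiting idea, not the authors' method. It assumes a confidence-thresholded exit rule: a small classifier head is attached after each encoder layer, and inference stops at the first head whose softmax confidence reaches a threshold. The names `early_exit_predict`, `layers`, `exit_heads`, and `threshold` are hypothetical, and the toy layers stand in for WavLM transformer layers.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def early_exit_predict(
    features: Vector,
    layers: Sequence[Callable[[Vector], Vector]],
    exit_heads: Sequence[Callable[[Vector], Vector]],
    threshold: float = 0.9,
) -> Tuple[int, float, int]:
    """Run layers in order and stop at the first sufficiently confident exit.

    Returns (predicted_label, confidence, layers_used). If no exit reaches
    the threshold, the prediction of the final exit head is returned.
    """
    if not layers or len(layers) != len(exit_heads):
        raise ValueError("need one exit head per layer, and at least one layer")

    hidden = features
    label, confidence, depth = 0, 0.0, 0
    for depth, (layer, head) in enumerate(zip(layers, exit_heads), start=1):
        hidden = layer(hidden)            # one encoder layer (assumed stand-in)
        probs = softmax(head(hidden))     # lightweight classifier at this depth
        label = max(range(len(probs)), key=probs.__getitem__)
        confidence = probs[label]
        if confidence >= threshold:       # confident enough: skip later layers
            break
    return label, confidence, depth


# Toy stand-ins: each "layer" doubles the feature, so the evidence for
# class 0 (say, genuine) over class 1 (say, spoofed) grows with depth.
toy_layers = [lambda h: [2.0 * x for x in h]] * 4
toy_heads = [lambda h: [h[0], -h[0]]] * 4

label, confidence, layers_used = early_exit_predict([0.5], toy_layers, toy_heads)
print(label, round(confidence, 3), layers_used)
```

In this toy setup the first exit is not confident enough, so the input passes through a second layer before a decision is made, and the remaining two layers are never evaluated. That skipped computation is the source of the savings that early exiting targets on resource-limited devices.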