Evaluating Deepfake Speech and ASV Systems on African Accents

Kweku Andoh Yamoah; Hussein Baba Fuseini; David Ebo Adjepon-Yamoah; Dennis Asamoah Owusu

07 Jul 2023 (modified: 07 Dec 2023), DeepLearningIndaba 2023 Conference Submission

Keywords: Automatic Speaker Verification (ASV), Deep Neural Networks (DNN), SV2TTS, Resemblyzer, Mean Opinion Score (MOS), Equal Error Rate (EER)

TL;DR: This paper examines the impact of deepfake audio with African accents on ASV systems, using SV2TTS as the synthesis model and Resemblyzer as the ASV system.

Abstract: Automatic Speaker Verification (ASV) systems are vital for seamless speech-based authentication in digital systems. However, the rise of deep neural network (DNN)-based voice synthesis has introduced the risk of deepfake audio that convincingly mimics human voices, posing a significant threat to both individual identities and ASV system security. To address this, an extensive study examined the impact of deepfake audio with African accents on ASV systems. The findings reveal that modern ASV systems, such as Resemblyzer, are less susceptible to deception by deepfake audio with African accents. These results highlight the need to develop deepfake audio systems that accurately simulate authentic African accents, so that the technology can be used effectively to address modern challenges in Africa.
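
The keywords name Equal Error Rate (EER) as an evaluation metric. As a minimal sketch of how an EER could be computed from speaker-similarity scores, and not the authors' actual evaluation code, the snippet below sweeps a decision threshold and reports the point where the false acceptance rate on deepfake trials and the false rejection rate on genuine trials are closest. The function name `equal_error_rate` and all scores are hypothetical; in practice the scores would come from an ASV system comparing an enrolled voice against test utterances.

```python
def equal_error_rate(genuine_scores, spoof_scores):
    """Return (eer, threshold) for scores where higher means 'same speaker'.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if not genuine_scores or not spoof_scores:
        raise ValueError("both score lists must be non-empty")

    best = None  # (gap between FAR and FRR, EER estimate, threshold)
    # Every observed score is a candidate decision threshold.
    for threshold in sorted(set(genuine_scores) | set(spoof_scores)):
        # False acceptance rate: deepfake trials scored at or above the threshold.
        far = sum(s >= threshold for s in spoof_scores) / len(spoof_scores)
        # False rejection rate: genuine trials scored below the threshold.
        frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, threshold)
    return best[1], best[2]


# Made-up similarity scores, for illustration only.
genuine = [0.91, 0.85, 0.78, 0.88, 0.60]
deepfake = [0.55, 0.62, 0.40, 0.81, 0.35]

eer, threshold = equal_error_rate(genuine, deepfake)
print(f"EER = {eer:.2f} at threshold {threshold:.2f}")
```

Under this convention, a lower EER on deepfake trials means the verifier separates synthetic from genuine speech more reliably, which is the sense in which an ASV system is "less susceptible" to deception.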

Submission Category: Machine learning algorithms

Submission Number: 13
