MLADDC: Multi-Lingual Audio Deepfake Detection Corpus

Published: 10 Oct 2024, Last Modified: 29 Oct 2024
Venue: Audio Imagination: NeurIPS 2024 Workshop
License: CC BY 4.0
Keywords: Audio Deepfake Detection, GANs, Multi-Lingual Deepfakes, Half-Truth.
TL;DR: We propose an audio deepfake detection dataset in which the fake data has been generated with HiFi-GAN and BigVGAN. The dataset also contains partially fake audio, for a total of 1,125+ hours of data.
Abstract: This study develops the Multi-Lingual Audio Deepfake Detection Corpus (MLADDC) to boost audio deepfake detection (ADD) research. Existing datasets suffer from several limitations; in particular, they are limited to one or two languages. The proposed dataset covers 20 languages, released in 4 tracks (6 Indian languages, 14 international languages, half-truth data in all 20 languages, and the combined data). Moreover, the proposed dataset contains 400K files (1,125+ hours of data), making it one of the largest datasets of its kind. Deepfakes in MLADDC have been produced using advanced deep-learning methods, namely HiFi-GAN and BigVGAN. Another novelty lies in its sub-dataset of partial deepfakes (Half-Truth). We compared our dataset with various existing datasets using a cross-database method. For comparison, we also propose a baseline accuracy of 68.44% and an EER of 40.9% with MFCC features and a CNN classifier (14-language track only), indicating the technological challenges associated with the ADD task on the proposed dataset.
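The baseline pairs MFCC features with a CNN classifier. Below is a minimal sketch of that kind of pipeline, not the authors' released code: it assumes librosa for feature extraction and PyTorch for the classifier, and the file path, number of coefficients, and network layout are illustrative assumptions.

```python
# Hedged sketch of an MFCC + CNN audio deepfake baseline.
# Hyperparameters and the example file path are assumptions, not MLADDC's configuration.
import librosa
import numpy as np
import torch
import torch.nn as nn

def mfcc_features(wav_path, sr=16000, n_mfcc=40, max_frames=300):
    """Load audio and return a fixed-size (n_mfcc, max_frames) MFCC matrix."""
    y, _ = librosa.load(wav_path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Pad or truncate along the time axis so every clip has the same shape.
    if mfcc.shape[1] < max_frames:
        mfcc = np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])))
    return mfcc[:, :max_frames]

class DeepfakeCNN(nn.Module):
    """Small 2D CNN that classifies an MFCC 'image' as real (0) or fake (1)."""
    def __init__(self, n_mfcc=40, max_frames=300):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.fc = nn.Linear(32 * (n_mfcc // 4) * (max_frames // 4), 2)

    def forward(self, x):  # x: (batch, 1, n_mfcc, max_frames)
        z = self.conv(x)
        return self.fc(z.flatten(1))

if __name__ == "__main__":
    # "example.wav" is a hypothetical placeholder; real use would train the
    # model on one of the MLADDC tracks before scoring clips.
    feats = mfcc_features("example.wav")
    x = torch.tensor(feats, dtype=torch.float32)[None, None]
    model = DeepfakeCNN()
    print(torch.softmax(model(x), dim=-1))  # [P(real), P(fake)]
```

EER would then be computed by sweeping a threshold over the fake-class scores until the false-acceptance and false-rejection rates are equal, which is the standard way such a baseline is evaluated.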
Submission Number: 53