EDAudio: Easy Data Augmentation for Dialectal Audio

Lea Fischbach, Akbar Karimi, Alfred Lameli, Lucie Flek

Published: 2025, Last Modified: 25 May 2026, RANLP 2025, CC BY-SA 4.0

Abstract: We investigate lightweight and easily applicable data augmentation techniques for dialectal audio classification. We evaluate four main methods, namely pitch shifting, interval removal, background noise insertion, and interval swap, as well as several subvariants, on recordings from 20 German dialects. Each main method is tested across multiple hyperparameter combinations, including augmentation length, coverage ratio, and number of augmentations per original sample. Our results show that frequency-based techniques, particularly frequency masking, consistently yield performance improvements, while others, such as time masking or speaker-based insertion, can negatively affect the results. Our comparative analysis identifies which augmentations are most effective under realistic conditions, offering simple and efficient strategies to improve dialectal speech classification.
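To make the hyperparameters named in the abstract concrete, the following is a minimal sketch of two of the time-domain augmentations (interval removal and interval swap) on a raw sample sequence. It is not the authors' implementation: the function names, the fixed-length chunking scheme, and the way the coverage ratio is rounded are all assumptions made for illustration, and the example uses only the Python standard library.

```python
import random


def remove_intervals(samples, interval_len, coverage, rng):
    """Drop randomly chosen fixed-length chunks (hypothetical helper).

    interval_len plays the role of "augmentation length" and coverage the
    role of "coverage ratio"; both interpretations are assumptions.
    """
    n_chunks = len(samples) // interval_len
    n_remove = round(n_chunks * coverage)
    drop = set(rng.sample(range(n_chunks), n_remove))
    out = []
    for i in range(n_chunks):
        if i not in drop:
            out.extend(samples[i * interval_len:(i + 1) * interval_len])
    # Keep any trailing samples that do not fill a whole chunk.
    out.extend(samples[n_chunks * interval_len:])
    return out


def swap_intervals(samples, interval_len, rng):
    """Exchange two distinct fixed-length chunks; the length is unchanged."""
    n_chunks = len(samples) // interval_len
    out = list(samples)
    if n_chunks < 2:
        return out  # nothing to swap
    a, b = rng.sample(range(n_chunks), 2)
    span_a = slice(a * interval_len, (a + 1) * interval_len)
    span_b = slice(b * interval_len, (b + 1) * interval_len)
    out[span_a], out[span_b] = samples[span_b], samples[span_a]
    return out


def augment(samples, n_aug, interval_len, coverage, seed=0):
    """Produce n_aug interval-removal variants of one original sample."""
    rng = random.Random(seed)
    return [remove_intervals(samples, interval_len, coverage, rng)
            for _ in range(n_aug)]


if __name__ == "__main__":
    signal = [float(i) for i in range(100)]  # stand-in for audio samples
    variants = augment(signal, n_aug=3, interval_len=10, coverage=0.3)
    print([len(v) for v in variants])
```

In this sketch each variant loses the same fraction of the signal but at different random positions, which is one plausible reading of "number of augmentations per original sample". Frequency-based methods such as pitch shifting or frequency masking operate on a spectral representation and would need an audio library, so they are left out here.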