ManWav1.0: The First Manchu ASR Model as the Milestone to Future Low-Resource ASR

Anonymous

16 Dec 2023 · ACL ARR 2023 December Blind Submission
TL;DR: ManWav1.0, the first Manchu ASR model, shows a significant performance improvement when fine-tuned on augmented data rather than the original data alone.
Abstract: This study addresses the widening gap in Automatic Speech Recognition (ASR) research between high-resource and low-resource languages, with a particular focus on Manchu, a severely underrepresented language. As an extremely low-resource language, Manchu exemplifies the challenges faced by marginalized linguistic communities in accessing state-of-the-art technologies. In a pioneering effort, we introduce the first-ever Manchu ASR model, leveraging Wav2Vec 2.0-XLSR. This development demonstrates the adaptability of advanced ASR models to bridge the gap for low-resource languages. The results of the first Manchu ASR model are promising, especially when our data augmentation method is employed. Wav2Vec 2.0-XLSR fine-tuned with augmented data shows a 2%p drop in CER and a 13%p drop in WER compared to the same model fine-tuned with the original data. This advancement not only marks a significant step in ASR research but also incorporates linguistic diversity into technological innovation.
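The reported CER and WER are standard edit-distance metrics: the Levenshtein distance between reference and hypothesis, normalized by reference length, computed over characters (CER) or words (WER). A minimal sketch of the computation (the Manchu strings below are hypothetical transliterations for illustration, not the paper's data):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance via dynamic programming with a rolling row."""
    n = len(hyp)
    dp = list(range(n + 1))  # distances for the empty reference prefix
    for i in range(1, len(ref) + 1):
        prev, dp[0] = dp[0], i  # prev holds the diagonal (previous row, j-1)
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(
                dp[j] + 1,                          # deletion
                dp[j - 1] + 1,                      # insertion
                prev + (ref[i - 1] != hyp[j - 1]),  # substitution (0 if match)
            )
            prev = cur
    return dp[n]

def cer(ref, hyp):
    """Character error rate: character-level edit distance / reference length."""
    return edit_distance(list(ref), list(hyp)) / len(ref)

def wer(ref, hyp):
    """Word error rate: the same computation over whitespace tokens."""
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())

# Hypothetical example: one substitution and one deletion in 5 characters.
print(cer("bithe", "bita"))          # → 0.4
print(wer("manju gisun", "manju gisu"))  # → 0.5
```

A "%p drop" in the abstract refers to a difference in percentage points of these rates, not a relative reduction.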
Paper Type: short
Research Area: Speech recognition, text-to-speech and spoken language understanding
Contribution Types: Approaches to low-resource settings, Publicly available software and/or pre-trained models, Data resources
Languages Studied: Manchu