Multilingual Model and Data Resources for Text-To-Speech in Ugandan Languages

Published: 03 Mar 2023, Last Modified: 21 Apr 2023
Venue: AfricaNLP 2023
Readers: Everyone
Keywords: Text-to-speech, speech interfaces, datasets
TL;DR: We announce a new public dataset and deployment-grade models for text-to-speech in Uganda.
Abstract: We present new resources for text-to-speech in Ugandan languages. Studio-grade recordings in Luganda and English were captured for 2,413 and 2,437 utterances respectively (totaling 4,850 utterances representing 5 hours of speech). We show that this is sufficient to train high-quality TTS models that can generate natural-sounding speech in either language, or in combinations of both with code-switching. We also present results on training TTS in Luganda using crowdsourced recordings from Common Voice. Additional data collection is currently underway for the Acholi, Ateso, Lugbara and Runyankole languages. The data we describe extends the SALT dataset, which already contains multi-way parallel translated text in six languages. The dataset and models described are publicly available at https://github.com/SunbirdAI/salt.