Subset Selection, Adaptation and Gemination for Amharic Text-to-Speech Synthesis

Elshadai Tesfaye Biru, Yishak Tofik Mohammed, David Tofu, Erica Cooper, Julia Hirschberg

25 May 2020 | OpenReview Archive Direct Upload | Readers: Everyone

Abstract: Large TTS corpora exist for commercial systems built for high-resource languages such as Mandarin, English, and Spanish, but not for many other languages, such as Amharic, that are spoken by millions of people. We work with “found” data, either collected for other purposes (e.g. training ASR systems) or available on the web (e.g. news broadcasts, audiobooks), to produce TTS systems for low-resource languages that do not currently have expensive commercial systems. This study describes TTS systems built for Amharic from “found” data, including systems built from different acoustic-prosodic subsets of the data, systems built from combined high- and lower-quality data using adaptation, and systems that use prediction of Amharic gemination to improve naturalness as perceived by evaluators.
