Neural Text-to-Speech Synthesis for VõroDownload PDF

Published: 20 Mar 2023, Last Modified: 18 Apr 2023NoDaLiDa 2023Readers: Everyone
Abstract: This paper presents the first high-quality neural text-to-speech (TTS) system for Võro, a minority language spoken in Southern Estonia. By leveraging existing Estonian TTS models and datasets, we analyze whether common low-resource NLP techniques, such as cross-lingual transfer learning from related languages or multi-task learning, can benefit our low-resource use case. Our results show that we can achieve high-quality Võro TTS without transfer learning and that using more diverse training data can even decrease synthesis quality. While these techniques may still be useful in some cases, our work highlights the need for caution when applied in specific low-resource scenarios, and it can provide valuable insights for future low-resource research and efforts in preserving minority languages.
Student Paper: Yes, the first author is a student
3 Replies

Loading