The Tacotron2-based IPA-to-Speech speech synthesis system

Published: 01 Jan 2023, Last Modified: 01 Oct 2024, SPML 2023, CC BY-SA 4.0
Abstract: To help language learners better understand the pronunciation of a language, we propose an IPA-to-Speech speech synthesis system that aims to generate high-quality human speech from text written in International Phonetic Alphabet (IPA) format. The system has two main parts: a Transformer-based grapheme-to-phoneme (G2P) converter and a Tacotron2-based speech synthesis module. The G2P converter builds the training data by converting all the English sentences in LJSpeech into their IPA forms, and the speech synthesis module generates speech from the IPA sentences. We evaluated the G2P converter with word error rate and phoneme error rate, and the speech synthesis module with mean opinion score. This work also inspired us to use the IPA format to represent dialects; in future work, we will continue this research on dialect recognition and generation.
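The abstract does not say how the two G2P metrics are computed, so the following is only a minimal sketch under commonly used definitions, which are assumptions here: phoneme error rate as the total phoneme-level edit distance divided by the number of reference phonemes, and word error rate as the fraction of words whose predicted pronunciation differs from the reference in any way. The function names and the sample data are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def g2p_error_rates(
    pairs: List[Tuple[Sequence[str], Sequence[str]]]
) -> Tuple[float, float]:
    """Return (word error rate, phoneme error rate).

    Each pair is (reference phonemes, predicted phonemes) for one word.
    These definitions are assumed, not confirmed by the paper.
    """
    total_phonemes = sum(len(ref) for ref, _ in pairs)
    phoneme_errors = sum(edit_distance(ref, hyp) for ref, hyp in pairs)
    word_errors = sum(1 for ref, hyp in pairs if list(ref) != list(hyp))
    return word_errors / len(pairs), phoneme_errors / total_phonemes


# Hypothetical example: the second word is missing one phoneme.
pairs = [
    (["h", "ə", "l", "oʊ"], ["h", "ə", "l", "oʊ"]),
    (["w", "ɜː", "l", "d"], ["w", "ɜː", "d"]),
]
wer, per = g2p_error_rates(pairs)
print(f"WER={wer:.3f} PER={per:.3f}")
```

Phonemes are kept as lists of strings rather than raw characters because some IPA symbols, such as diphthongs or long vowels, span more than one Unicode code point.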