Submission from vivo for Blizzard Challenge 2019

Published: 01 Jan 2019, Last Modified: 13 Nov 2024, Blizzard Challenge 2019, CC BY-SA 4.0
Abstract: This paper presents the vivo speech synthesis system for Blizzard Challenge 2019. The task is to build an expressive speech synthesis system from an 8-hour corpus of a well-known Chinese talk-show character. Our system is based on Tacotron with several minor improvements: clearer speech energy normalization, removal of problematic shorter utterances as outliers, and special phone modelling for explicit long silences, audible breath sounds, and mouth-click sounds. Evaluation results showed that the proposed system is somewhat successful.
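The abstract does not describe how the energy normalization or the outlier removal were implemented. As a rough illustration only, the sketch below shows one common way such preprocessing could look: scaling each utterance to a fixed RMS level and discarding utterances below a minimum duration. The function names, the target RMS of 0.1, the 16 kHz sample rate, and the 1-second threshold are all assumptions made for this example and are not taken from the paper.

```python
import math


def rms(samples):
    """Root-mean-square energy of a sequence of float samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def normalize_energy(samples, target_rms=0.1):
    """Scale samples so their RMS equals target_rms (assumed target level)."""
    current = rms(samples)
    if current == 0.0:
        return list(samples)  # silent input: nothing to scale
    gain = target_rms / current
    return [s * gain for s in samples]


def drop_short_utterances(utterances, sample_rate=16000, min_seconds=1.0):
    """Keep only utterances at least min_seconds long (assumed threshold)."""
    min_len = int(sample_rate * min_seconds)
    return [u for u in utterances if len(u) >= min_len]
```

In a pipeline of this kind, filtering would typically run before normalization so that discarded utterances are not processed needlessly; whether the submitted system used this order is not stated.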