Atom decomposition-based intonation modelling

Pierre-Edouard Honnet, Branislav Gerazov, Philip N. Garner

ICASSP 2015 (modified: 20 Sept 2021)

Abstract: Current statistical parametric text-to-speech (TTS) synthesis methods can produce neutral speech of acceptable quality. However, the prosody is often judged unsatisfactory and too flat. In this paper, we address intonation modelling for TTS based on physiological aspects of prosody production. A set of gamma-distribution-shaped atoms is defined, and intonation is then decomposed using a matching pursuit algorithm. Preliminary experiments show that this model allows easy extraction of physiologically meaningful atoms that could be used to generate intonation in a TTS system.
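To make the decomposition step concrete, the following is a minimal sketch of matching pursuit over a small dictionary of gamma-shaped atoms. It is not the authors' implementation: the abstract does not give the atom parameterisation, the dictionary, or the stopping criterion, so the function names (gamma_atom, matching_pursuit), the shape and scale values, the fixed iteration count, and the synthetic contour are all assumptions chosen for illustration. In practice the input would presumably be an intonation contour such as log-F0.

```python
import numpy as np

def gamma_atom(length, k, theta):
    """Unit-norm atom shaped like a gamma pdf (shape k, scale theta in samples).
    Hypothetical parameterisation; the paper's own definition may differ."""
    t = np.arange(1, length + 1, dtype=float)
    g = t ** (k - 1) * np.exp(-t / theta)
    return g / np.linalg.norm(g)

def matching_pursuit(signal, atoms, n_iter):
    """Greedy decomposition: at each step pick the atom and shift with the
    largest absolute correlation, then subtract its projection."""
    residual = np.asarray(signal, dtype=float).copy()
    picks = []
    for _ in range(n_iter):
        best = None
        for idx, atom in enumerate(atoms):
            # Inner product of the residual with the atom at every valid shift
            corr = np.correlate(residual, atom, mode="valid")
            pos = int(np.argmax(np.abs(corr)))
            if best is None or abs(corr[pos]) > abs(best[2]):
                best = (idx, pos, corr[pos])
        idx, pos, amp = best
        residual[pos:pos + len(atoms[idx])] -= amp * atoms[idx]
        picks.append((idx, pos, float(amp)))
    return picks, residual

# Assumed two-atom dictionary: a narrow and a wide gamma shape
atoms = [gamma_atom(50, 6, 2.0), gamma_atom(50, 6, 5.0)]

# Synthetic contour built from two non-overlapping, scaled atoms
signal = np.zeros(200)
signal[10:60] += 2.0 * atoms[0]
signal[120:170] += -1.0 * atoms[1]

picks, residual = matching_pursuit(signal, atoms, n_iter=2)
```

Each entry of picks is an (atom index, onset sample, amplitude) triple. Because the atoms are unit-norm, every iteration removes the squared correlation from the residual energy, so the residual can never grow from one step to the next.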
