Voice transformation using PSOLA technique

Published: 1991, Last Modified: 25 Jan 2026, Venue: EUROSPEECH 1991, License: CC BY-SA 4.0
Abstract: Whereas speaker normalization and adaptation have received a lot of attention for speech recognition, few studies have been devoted to voice transformation for speech synthesis, despite the potential interest of such techniques. Converting voice individuality requires modifications of the spectrum, the glottal excitation, and the prosody. This work focuses on spectral modifications, but some simple prosodic alterations are also taken into account. We combine two techniques to simulate a change of speaker. The first is the TD-PSOLA technique, which is very efficient at altering prosody. The second is a classical source-filter decomposition, which extracts from the signal a spectral representation on which the spectral modifications are performed. Two approaches are suggested to transform the spectrum: the first is the well-known Linear Multivariate Regression; the second is Dynamic Frequency Warping.
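As an illustration of the first spectral transformation named in the abstract, the following is a minimal sketch of a linear multivariate regression between paired spectral vectors. It is not the authors' implementation: the abstract does not specify the spectral features, how source and target frames are paired, or whether the regression is fitted per acoustic class. The function names `fit_lmr` and `apply_lmr`, the affine (bias) term, and the assumption that frames are already time-aligned are all hypothetical choices made for this example.

```python
import numpy as np


def fit_lmr(source, target):
    """Fit a least-squares affine map from source to target spectral vectors.

    source, target: arrays of shape (n_frames, n_dims), assumed to be
    already time-aligned frame by frame (an assumption of this sketch).
    Returns a weight matrix of shape (n_dims + 1, n_dims).
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    # Append a constant column so the map includes a bias term.
    augmented = np.hstack([source, np.ones((source.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(augmented, target, rcond=None)
    return weights


def apply_lmr(weights, source):
    """Map source spectral vectors toward the target speaker."""
    source = np.atleast_2d(np.asarray(source, dtype=float))
    augmented = np.hstack([source, np.ones((source.shape[0], 1))])
    return augmented @ weights
```

In a complete system of the kind the abstract describes, the transformed spectral vectors would be recombined with the excitation and the prosody would be adjusted separately (for instance with TD-PSOLA); neither step is shown here.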