Abstract: Whereas speaker adaptation has received much attention in speech recognition, few studies have been devoted to voice transformation for speech synthesis, despite the potential interest of such techniques. The authors propose a voice conversion system which combines the time-domain pitch synchronous overlap and add (TD-PSOLA) technique with a source-filter decomposition. The first technique allows prosodic modifications, while the second enables spectral envelope transformations. Two approaches to learning the spectral transformation are compared: linear multivariate regression (LMR) and dynamic frequency warping (DFW).
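
The abstract does not say how DFW is computed. As an illustration only, frequency warping can be sketched as a dynamic-programming alignment of two spectral envelopes along the frequency axis. The sketch below assumes each envelope is a list of log-magnitude values on the same frequency grid and uses an absolute-difference local cost with unconstrained steps. The function name `dfw_path` and these choices are hypothetical and are not taken from the paper.

```python
def dfw_path(source_env, target_env):
    """Align two spectral envelopes along the frequency axis (illustrative sketch).

    Both inputs are sequences of log-magnitude values (assumed representation).
    Returns (path, total_cost), where path is a list of (source_bin, target_bin).
    """
    n, m = len(source_env), len(target_env)
    inf = float("inf")
    # cost[i][j]: minimal accumulated distance aligning source bins 0..i
    # with target bins 0..j
    cost = [[inf] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = abs(source_env[i] - target_env[j])  # local spectral distance
            if i == 0 and j == 0:
                best = 0.0
            else:
                best = min(
                    cost[i - 1][j] if i > 0 else inf,
                    cost[i][j - 1] if j > 0 else inf,
                    cost[i - 1][j - 1] if i > 0 and j > 0 else inf,
                )
            cost[i][j] = d + best

    # Backtrack from the last bin pair to the first to recover the warping path.
    i, j = n - 1, m - 1
    path = [(i, j)]
    while (i, j) != (0, 0):
        candidates = []
        if i > 0 and j > 0:
            candidates.append((cost[i - 1][j - 1], i - 1, j - 1))
        if i > 0:
            candidates.append((cost[i - 1][j], i - 1, j))
        if j > 0:
            candidates.append((cost[i][j - 1], i, j - 1))
        _, i, j = min(candidates)
        path.append((i, j))
    path.reverse()
    return path, cost[n - 1][m - 1]


# Toy envelopes: a single spectral peak shifted up by one frequency bin.
source = [0, 0, 5, 0, 0, 0]
target = [0, 0, 0, 5, 0, 0]
path, total = dfw_path(source, target)
```

In this toy case the recovered path pairs the source peak (bin 2) with the target peak (bin 3), which is the kind of frequency-axis mapping a warping function would then apply to new source spectra. A practical system would add slope constraints and average the warping over many aligned frames.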
External IDs:dblp:conf/icassp/ValbretMT92