Transformation of spectral envelope for voice conversion based on radial basis function networks

Published: 15 Sept 2002, Last Modified: 04 May 2025, ICSLP-2002, Everyone, CC BY 4.0
Abstract: This paper presents a novel algorithm that modifies speech uttered by a source speaker so that it sounds as if it were produced by a target speaker. In particular, we address the transformation of vocal tract characteristics from one speaker to another. The approach is based on estimating spectral envelopes using radial basis function (RBF) networks, a well-known class of artificial neural network. Simulation results show that the proposed method achieves nearly optimal spectral conversion performance. Moreover, the average cepstrum distance to the target speech is reduced by 87%, and a mean opinion score (MOS) of around 84% is obtained in listening tests.
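
As a rough illustration of the kind of mapping the abstract describes, the sketch below fits a Gaussian RBF network that maps source-speaker cepstral vectors to target-speaker ones and reports the relative reduction in average cepstral distance. It is not the paper's implementation: the abstract does not specify the features, the training procedure, or the exact distance definition, so the random choice of centers, the least-squares output weights, the Euclidean distance, the synthetic data, and all function names here are assumptions made for illustration.

```python
import numpy as np

def rbf_design(X, centers, width):
    """Gaussian RBF activations for each frame, plus a constant bias column."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    Phi = np.exp(-d2 / (2.0 * width ** 2))
    return np.hstack([Phi, np.ones((X.shape[0], 1))])

def fit_rbf(X_src, Y_tgt, n_centers=20, width=1.0, seed=0):
    """Assumed training scheme: centers sampled from the source frames,
    output weights solved by linear least squares."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(X_src.shape[0], size=n_centers, replace=False)
    centers = X_src[idx]
    Phi = rbf_design(X_src, centers, width)
    W, *_ = np.linalg.lstsq(Phi, Y_tgt, rcond=None)
    return centers, W

def convert(X_src, centers, W, width=1.0):
    """Map source cepstral frames to estimated target cepstral frames."""
    return rbf_design(X_src, centers, width) @ W

def mean_cepstral_distance(A, B):
    """Average Euclidean distance between paired frames (assumed metric)."""
    return float(np.mean(np.linalg.norm(A - B, axis=1)))

def distance_reduction(X_src, Y_conv, Y_tgt):
    """Fraction by which conversion shrinks the distance to the target."""
    before = mean_cepstral_distance(X_src, Y_tgt)
    after = mean_cepstral_distance(Y_conv, Y_tgt)
    return 1.0 - after / before

# Synthetic, time-aligned "source" and "target" frames (purely illustrative).
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 4))
Y = 0.5 * np.tanh(X) + 3.0

centers, W = fit_rbf(X, Y, n_centers=20, width=1.0)
Y_conv = convert(X, centers, W, width=1.0)
reduction = distance_reduction(X, Y_conv, Y)
print(f"relative cepstral distance reduction: {reduction:.2%}")
```

The reduction printed here depends entirely on the synthetic data and is evaluated on the training frames, so it is not comparable to the 87% figure reported in the abstract; a real evaluation would use aligned cepstral frames from parallel recordings and held-out utterances.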