The NUS & NWPU system for Voice Conversion Challenge 2020

Published: 01 Jan 2020, Last Modified: 17 Apr 2025, Blizzard Challenge / Voice Conversion Challenge 2020, CC BY-SA 4.0
Abstract: This paper presents the NUS & NWPU voice conversion system for Voice Conversion Challenge 2020. Our submission is a Phonetic PosteriorGram (PPG) based voice conversion system consisting of three modules: a PPG extractor, a feature conversion module, and a converted speech signal generation module. First, the PPG extractor extracts speaker-independent content features from a speech signal. Then, an encoder-decoder based feature conversion model predicts the converted features from the PPG inputs. Finally, a multiband WaveRNN generates the time-domain speech signal from the converted features. The same implementation is used for both the intra-lingual and cross-lingual voice conversion tasks. Evaluation results demonstrate the effectiveness of the proposed system.
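The three-module structure described in the abstract can be sketched as a simple composition of stages. This is only an illustrative outline, not the authors' implementation: the class name `VoiceConversionPipeline`, the `toy_*` functions, the hop size, and the two-dimensional frames are all hypothetical placeholders standing in for the real PPG extractor, the encoder-decoder conversion model, and the multiband WaveRNN vocoder.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Frame = List[float]  # one feature vector per time frame


@dataclass
class VoiceConversionPipeline:
    """Hypothetical wrapper chaining the three stages named in the abstract."""
    extract_ppg: Callable[[Sequence[float]], List[Frame]]
    convert_features: Callable[[List[Frame]], List[Frame]]
    generate_waveform: Callable[[List[Frame]], List[float]]

    def convert(self, source_waveform: Sequence[float]) -> List[float]:
        ppg = self.extract_ppg(source_waveform)      # speaker-independent content
        features = self.convert_features(ppg)        # target-speaker features
        return self.generate_waveform(features)      # time-domain signal


HOP = 4  # assumed number of samples per frame, chosen only for this toy example


def toy_extract_ppg(waveform: Sequence[float]) -> List[Frame]:
    # Placeholder: one dummy posterior vector (summing to 1) per HOP samples.
    return [[1.0, 0.0] for _ in range(len(waveform) // HOP)]


def toy_convert_features(ppg: List[Frame]) -> List[Frame]:
    # Placeholder for the encoder-decoder model: a fixed scaling per frame.
    return [[0.5 * value for value in frame] for frame in ppg]


def toy_generate_waveform(features: List[Frame]) -> List[float]:
    # Placeholder for the vocoder: emit HOP samples per feature frame.
    samples: List[float] = []
    for frame in features:
        samples.extend([frame[0]] * HOP)
    return samples


pipeline = VoiceConversionPipeline(
    extract_ppg=toy_extract_ppg,
    convert_features=toy_convert_features,
    generate_waveform=toy_generate_waveform,
)
converted = pipeline.convert([0.0] * 16)
```

Because the stages only communicate through frame sequences, the same chain could in principle serve both intra-lingual and cross-lingual conversion, which matches the abstract's statement that one implementation is used for both tasks.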