POLICY DRIVEN GENERATIVE ADVERSARIAL NETWORKS FOR ACCENTED SPEECH GENERATION
Prannay Khosla, Preethi Jyothi, Vinay P. Namboodiri, Mukundhan Srinivasan
Feb 15, 2018 (modified: Feb 15, 2018), ICLR 2018 Conference Blind Submission
Abstract: In this paper, we propose the generation of accented speech using generative adversarial
networks. We make two main contributions: (a) the ability to condition latent
representations while generating realistic speech samples, and (b) the ability to
efficiently generate long speech samples using a novel latent variable transformation
module that is trained using policy gradients. Previous methods are either limited to
generating relatively short samples or are inefficient at generating long samples.
The generated speech samples are validated through several evaluation measures, viz.
a WGAN critic loss, subjective scores from user evaluations against competitive
speech synthesis baselines, and a detailed ablation analysis of the proposed model.
The evaluations demonstrate that the model efficiently generates realistic long speech
samples conditioned on accent.
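
The abstract names a WGAN critic loss as one evaluation measure but does not say how it is computed. As a minimal sketch, assuming the standard Wasserstein GAN formulation (critic loss = mean critic score on generated samples minus mean critic score on real samples), the quantity could look like the following. The function names and the toy scores are hypothetical and are not taken from the authors' code.

```python
from statistics import fmean


def wgan_critic_loss(real_scores, fake_scores):
    """Standard WGAN critic loss: E[D(fake)] - E[D(real)].

    real_scores / fake_scores are the critic's scalar outputs on real and
    generated samples (assumed here to be plain lists of floats).
    """
    return fmean(fake_scores) - fmean(real_scores)


def wasserstein_estimate(real_scores, fake_scores):
    """Negated critic loss, i.e. the critic's estimate of the distance
    between the real and generated distributions (smaller means closer)."""
    return -wgan_critic_loss(real_scores, fake_scores)


# Toy critic outputs, purely illustrative.
real = [1.0, 3.0]
fake = [0.0, 1.0]
loss = wgan_critic_loss(real, fake)
distance = wasserstein_estimate(real, fake)
```

Under this assumption, a smaller distance estimate from a trained critic suggests generated samples that are harder to tell apart from real speech, which is why such a loss can double as an evaluation signal.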