Non-parallel Accent Transfer based on Fine-grained Controllable Accent Modelling

Published: 07 Oct 2023, Last Modified: 01 Dec 2023, Venue: EMNLP 2023 Findings
Submission Type: Regular Long Paper
Submission Track: Speech and Multimodality
Submission Track 2: Speech and Multimodality
Keywords: Accent Transfer, Fine-grained Controllable Accent Modelling, Non-parallel
TL;DR: Accents are modelled at a fine-grained level in terms of tone and rhythm. Mutual information learning is used to disentangle accent features from speaker information and to control the accent of the generated speech.
Abstract: Existing accent transfer work relies on parallel data or speech recognition models. This paper focuses on the practical application of accent transfer and aims to implement it using non-parallel datasets. Doing so raises two challenges: disentangling speech representations and modelling accents. Our accent transfer framework addresses them with two proposed methods. First, we learn the suprasegmental information associated with tone to model accents at a fine-grained level in terms of tone and rhythm. Second, we use mutual information learning to disentangle the accent features and control the accent of the generated speech at inference time. Experiments show that the proposed framework outperforms the baseline models in terms of accentedness and audio quality.
Submission Number: 2281
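
The abstract refers to mutual information learning as the means of disentangling accent features. As a rough illustration of the quantity involved, and not the paper's estimator or training objective, the sketch below computes a plug-in estimate of mutual information between two discrete variables. The function name `mutual_information` and the toy accent and speaker labels are assumptions made for this example; a real system would work with continuous learned embeddings and a neural estimator.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X; Y) in nats from paired discrete samples."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy labels: a quantised accent code and a speaker ID per utterance.
accent_code = ["A", "A", "B", "B"]
speaker_entangled = ["s1", "s1", "s2", "s2"]    # accent fully predicts speaker
speaker_independent = ["s1", "s2", "s1", "s2"]  # accent tells nothing about speaker

mi_entangled = mutual_information(accent_code, speaker_entangled)
mi_independent = mutual_information(accent_code, speaker_independent)
print(mi_entangled, mi_independent)
```

In this toy setting, the entangled pairing gives a positive estimate while the independent pairing gives zero, which is the direction a disentanglement objective would push the accent and speaker representations.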