CLESSR-VC: Contrastive learning enhanced self-supervised representations for one-shot voice conversion

Published: 01 Jan 2024, Last Modified: 11 Apr 2025
Venue: Speech Commun. 2024
License: CC BY-SA 4.0
Abstract: Highlights
• SSL features are adopted to ensure the model’s generalization.
• A contrastive learning strategy enhances the representation ability of the features.
• The Mel-spectrogram is introduced to supplement the speaker information in SSL features.
• The proposed model has excellent performance in objective and subjective evaluations.