CLESSR-VC: Contrastive learning enhanced self-supervised representations for one-shot voice conversion
Abstract: Highlights
• Self-supervised learning (SSL) features are adopted to ensure the model’s generalization.
• A contrastive learning strategy enhances the representation ability of the features.
• The mel-spectrogram is introduced to compensate for the speaker information missing from SSL features.
• The proposed model achieves excellent performance in both objective and subjective evaluations.