Score and Lyrics-Free Singing Voice Generation

Jen-Yu Liu; Yu-Hua Chen; Yin-Cheng Yeh; Yi-Hsuan Yang

25 Sept 2019 (modified: 22 Jun 2025), ICLR 2020 Conference Blind Submission, Readers: Everyone

TL;DR: Our models generate singing voices without lyrics and scores. They take accompaniment as input and output singing voices.

Abstract: Generative models for singing voice have been mostly concerned with the task of "singing voice synthesis," i.e., producing singing voice waveforms given musical scores and text lyrics. In this work, we explore a novel yet challenging alternative: singing voice generation without pre-assigned scores and lyrics, at both training and inference time. In particular, we experiment with three different schemes: 1) free singer, where the model generates singing voices without any conditioning input; 2) accompanied singer, where the model generates singing voices over a waveform of instrumental music; and 3) solo singer, where the model first improvises a chord sequence and then uses it to generate voices. We outline the associated challenges and propose a pipeline to tackle these new tasks. This involves the development of source separation and transcription models for data preparation, adversarial networks for audio generation, and customized metrics for evaluation.

Keywords: singing voice generation, GAN, generative adversarial network

Community Implementations: [3 code implementations](https://www.catalyzex.com/paper/score-and-lyrics-free-singing-voice/code)
