ToneCraft: Cantonese Lyrics Generation with Harmony of Tones and Pitches

ACL ARR 2025 May Submission1059 Authors

16 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0
Abstract: Lyrics generation has garnered increasing attention within the artificial intelligence community. We focus on generating harmonious Cantonese lyrics. Unlike other languages, Cantonese has a unique system of nine contours and six tones, so lyrics must satisfy harmony rules that align the melody with the tonal contours of the words. Current research has not yet addressed the challenge of generating lyrics that adhere to these Cantonese harmony rules. To tackle this issue, we propose ToneCraft, a novel framework for generating Cantonese lyrics that ensures tonal and melodic harmony. It enables LLMs to generate lyrics with a fixed character count while aligning with tonal and melodic structures. We present an algorithm that combines character-level control, melodic guidance, and a task-specific loss to achieve tonal harmony without compromising generation flexibility and quality. By incorporating domain-specific expertise, we train our model on pure lyric datasets, eliminating the need for aligned data. Both objective evaluations and subjective assessments show that our generated lyrics align with melodic contours significantly better than those of existing methods. All code and data are available at: https://anonymous.4open.science/r/ToneCraft-DEDF.
Paper Type: Long
Research Area: Generation
Research Area Keywords: Melody-to-Lyric Generation, Harmony, Dialects and Language Varieties, Indigenous Languages
Contribution Types: Approaches to low-resource settings, Publicly available software and/or pre-trained models, Data resources, Theory
Languages Studied: Cantonese, Mandarin, English
Submission Number: 1059