Chord-Transformer: Chord-Progression Guided Transformer for Long-Sequence Symbolic Music Generation

18 Sept 2025 (modified: 26 Jan 2026) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Chord Progression Extraction; Chord-Aligned Positional Encoding; Cross-Attention Fusion
Abstract: Transformer-based models have become a central approach to symbolic music composition and editing. A key limitation of current music generation models is the lack of effective structural control, which makes it difficult to maintain harmonic coherence and structural integrity in generated music. This paper presents Chord-Transformer, an architecture that uses chord progression sequences as high-level semantic features to guide the generation process. Our approach employs an energy-based dynamic programming algorithm to extract chord progressions from the input data; these progressions then serve as structural constraints within a Transformer architecture for autoregressive chord-to-music generation. To strengthen the model's ability to capture musical structure, we design a chord-aligned positional encoding scheme and introduce a fusion module that combines cross-attention over the chord progression sequence with self-attention over the music sequence. This mechanism supports joint modeling of local and global chord context, improving the harmonic consistency and structural integrity of the generated music. Experimental results show that, compared with state-of-the-art baselines, the proposed method achieves significant improvements on key metrics, including scale consistency, polyphonic quality, and user preference scores.
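
To make the described architecture concrete, here is a minimal PyTorch sketch of two of the components the abstract names: a chord-aligned positional encoding (read here as giving each music token a chord-segment index plus its offset within that segment) and a fusion block that combines causal self-attention over the music sequence with cross-attention into the chord-progression sequence. The module names, hyperparameters, and the exact form of the encoding are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class ChordAlignedEncoding(nn.Module):
    """Learned embeddings for each token's chord segment and its offset
    within that segment (one plausible reading of 'chord-aligned
    positional encoding')."""
    def __init__(self, d_model: int = 512, max_segments: int = 256, max_offset: int = 512):
        super().__init__()
        self.segment_emb = nn.Embedding(max_segments, d_model)
        self.offset_emb = nn.Embedding(max_offset, d_model)

    def forward(self, x: torch.Tensor, segment_ids: torch.Tensor) -> torch.Tensor:
        # segment_ids: (batch, seq) chord-segment index of each music token.
        offset = torch.zeros_like(segment_ids)
        same = segment_ids[:, 1:] == segment_ids[:, :-1]
        for t in range(1, segment_ids.size(1)):
            # Offset counts up within a segment and resets at each chord change.
            offset[:, t] = (offset[:, t - 1] + 1) * same[:, t - 1]
        return x + self.segment_emb(segment_ids) + self.offset_emb(offset)

class ChordFusionBlock(nn.Module):
    """Decoder block: causal self-attention over music tokens, followed by
    cross-attention in which music tokens query the chord progression."""
    def __init__(self, d_model: int = 512, n_heads: int = 8):
        super().__init__()
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                                nn.Linear(4 * d_model, d_model))
        self.n1, self.n2, self.n3 = (nn.LayerNorm(d_model) for _ in range(3))

    def forward(self, music, chords, causal_mask):
        h = self.n1(music)
        music = music + self.self_attn(h, h, h, attn_mask=causal_mask)[0]
        h = self.n2(music)  # music tokens as queries; chord embeddings as keys/values
        music = music + self.cross_attn(h, chords, chords)[0]
        return music + self.ff(self.n3(music))

# Toy usage: 16 music tokens conditioned on a 4-chord progression.
music = torch.randn(2, 16, 512)
chords = torch.randn(2, 4, 512)
segments = torch.arange(16).unsqueeze(0).expand(2, -1) // 4  # 4 tokens per chord
music = ChordAlignedEncoding()(music, segments)
mask = torch.triu(torch.full((16, 16), float("-inf")), diagonal=1)
out = ChordFusionBlock()(music, chords, mask)
```

In a full model, several such blocks would be stacked, and the chord embeddings queried by the cross-attention would come from the progression extracted by the paper's energy-based dynamic programming step rather than from random tensors as in this toy usage.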
Supplementary Material: zip
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 11505