Unifying Symbolic Music Arrangement: Track-Aware Reconstruction and Structured Tokenization

Published: 18 Sept 2025, Last Modified: 29 Oct 2025
Venue: NeurIPS 2025 poster
License: CC BY-SA 4.0
Keywords: music arrangement, symbolic music, conditional music generation, symbolic music tokenization
TL;DR: A unified framework for multitrack music arrangement that enables a single pre-trained symbolic music model to handle diverse arrangement scenarios, together with REMI-z, a compact and modeling-friendly tokenization scheme for multitrack symbolic music.
Abstract: We present a unified framework for automatic multitrack music arrangement that enables a single pre-trained symbolic music model to handle diverse arrangement scenarios, including reinterpretation, simplification, and additive generation. At its core is a segment-level reconstruction objective operating on token-level disentangled content and style, allowing for flexible any-to-any instrumentation transformations at inference time. To support track-wise modeling, we introduce REMI-z, a structured tokenization scheme for multitrack symbolic music that enhances modeling efficiency and effectiveness for both arrangement tasks and unconditional generation. Our method outperforms task-specific state-of-the-art models on representative tasks from different arrangement scenarios (band arrangement, piano reduction, and drum arrangement) in both objective metrics and perceptual evaluations. Taken together, these results demonstrate the framework's strong generality and suggest broader applicability to symbolic music-to-music transformation.
Supplementary Material: gz
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 19229