Adaptable Symbolic Music Infilling with MIDI-RWKV

Christian Zhou-Zheng; Philippe Pasquier

Adaptable Symbolic Music Infilling with MIDI-RWKV

Christian Zhou-Zheng, Philippe Pasquier

18 Sept 2025 (modified: 11 Feb 2026)Submitted to ICLR 2026EveryoneRevisionsBibTeXCC BY 4.0

Keywords: Generative Models, Music Modeling and Analysis, Symbolic Music, Musical Infilling, State Tuning

TL;DR: We introduce a competitive small musical infilling model and a finetuning method for it that empirically outperforms LoRA on small datasets.

Abstract: Existing work in automatic music generation has mostly focused on end-to-end systems that generate either entire compositions or continuations of pieces, which are difficult for composers to iterate on. The area of computer-assisted composition, where generative models integrate into existing creative workflows, remains comparatively underexplored. In this study, we address the tasks of model style adaptation and multi-track, long-context, and controllable symbolic music infilling to enhance the process of computer-assisted composition. We present MIDI-RWKV, a small foundation model based on the RWKV-7 linear architecture, to enable efficient and coherent musical cocreation on edge devices. We also demonstrate that MIDI-RWKV admits an effective method of finetuning its initial state for style adaptation in the very-low-sample regime. We evaluate MIDI-RWKV and its state tuning on several quantitative and qualitative metrics with respect to existing models, and release model weights and code in the supplementary materials.

Supplementary Material: zip

Primary Area: applications to computer vision, audio, language, and other modalities

Submission Number: 13996

Loading