MuPT: A Generative Symbolic Music Pretrained Transformer

ACL ARR 2024 June Submission4529 Authors

16 Jun 2024 (modified: 08 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0
Abstract: In this paper, we explore the application of Large Language Models (LLMs) to pre-training on symbolic music. Although MIDI is the prevalent representation in music modeling, our findings suggest that LLMs are inherently more compatible with ABC Notation, which aligns more closely with their design and strengths and thereby improves performance in musical composition. To address the misalignment of measures across tracks during generation, we propose $\underline{S}$ynchronized $\underline{M}$ulti-$\underline{T}$rack ABC Notation ($\textbf{SMT-ABC Notation}$), which aims to preserve coherence across multiple musical tracks. Our contributions include a series of models capable of handling up to 8192 tokens, covering 90% of the symbolic music data in our training set. Furthermore, we explore the implications of the $\underline{S}$ymbolic $\underline{M}$usic $\underline{S}$caling Law ($\textbf{SMS Law}$) on model performance. The results indicate a promising direction for future research in music generation, and our open-source contributions offer extensive resources for community-led research.
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: Music, Scaling Law, Foundation Model
Contribution Types: Publicly available software and/or pre-trained models
Languages Studied: ABC-Notation Music language
Submission Number: 4529