Keywords: Symbolic music generation, Transformer, Controllability, Beam Search, Inference-time control
TL;DR: We introduce a dual-level beam search that enables explicit, inference-time control of tonal tension in Transformer-based symbolic music generation.
Abstract: Large language models (LLMs) and Transformer-based architectures have achieved remarkable progress in symbolic music generation, producing outputs with increasing coherence, stylistic richness, and expressive depth. Controllability in symbolic music generation is essential for aligning outputs with compositional intent and user-specified goals. Among high-level perceptual attributes, tonal tension remains underexplored as a target for explicit control. In this work, we present a novel approach that integrates a computational model of tonal tension into a Transformer-based generation framework through a dual-level beam search strategy. At the token level, candidate continuations are re-ranked by probability and diversity, while at the bar level, tension similarity measures are used to select continuations that align with a target tension curve. Preliminary evaluations indicate that this method enables explicit control of tonal tension while maintaining overall musical quality and coherence. This work contributes to the broader effort of aligning LLMs with creative control, and highlights tonal tension as an underexplored but musically salient axis of controllability.
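The dual-level search described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: `toy_log_probs` stands in for the Transformer, `toy_tension` is a crude circle-of-fifths proxy rather than the computational tension model the authors use, the diversity penalty is one simple choice among many, and absolute error against the target replaces whatever tension similarity measure the method actually employs.

```python
import math

VOCAB = list(range(12))  # pitch classes; a stand-in for a real token vocabulary


def toy_log_probs(prefix):
    """Hypothetical stand-in for a Transformer: favors small steps from the last pitch."""
    last = prefix[-1] if prefix else 0
    # Negative circular distance between pitch classes serves as an unnormalized score.
    scores = [-abs(((t - last + 6) % 12) - 6) for t in VOCAB]
    log_z = math.log(sum(math.exp(s) for s in scores))
    return [s - log_z for s in scores]


def toy_tension(bar):
    """Hypothetical tension proxy: mean circle-of-fifths distance from pitch class 0."""
    def fifths_dist(pc):
        pos = (pc * 7) % 12
        return min(pos, 12 - pos)
    return sum(fifths_dist(pc) for pc in bar) / len(bar)


def token_beam_search(prefix, bar_len, beam_width, diversity_penalty=0.5):
    """Token level: expand beams by log-probability, then re-rank with a diversity penalty."""
    beams = [([], 0.0)]
    for _ in range(bar_len):
        candidates = []
        for tokens, score in beams:
            log_probs = toy_log_probs(prefix + tokens)
            for tok in VOCAB:
                candidates.append((tokens + [tok], score + log_probs[tok]))
        candidates.sort(key=lambda c: c[1], reverse=True)
        pool = candidates[: beam_width * 4]
        selected, counts = [], {}
        # Greedy selection: each reuse of the same final token costs `diversity_penalty`.
        while pool and len(selected) < beam_width:
            best = max(pool, key=lambda c: c[1] - diversity_penalty * counts.get(c[0][-1], 0))
            pool.remove(best)
            selected.append(best)
            counts[best[0][-1]] = counts.get(best[0][-1], 0) + 1
        beams = selected
    return beams


def dual_level_beam_search(target_curve, bar_len=4, token_beams=6, bar_beams=2):
    """Bar level: keep the hypotheses whose per-bar tension best tracks the target curve."""
    hypotheses = [([], 0.0)]  # (tokens so far, accumulated absolute tension error)
    for target in target_curve:
        expanded = []
        for tokens, err in hypotheses:
            for bar, _ in token_beam_search(tokens, bar_len, token_beams):
                expanded.append((tokens + bar, err + abs(toy_tension(bar) - target)))
        expanded.sort(key=lambda h: h[1])
        hypotheses = expanded[:bar_beams]
    return hypotheses[0]


if __name__ == "__main__":
    tokens, error = dual_level_beam_search([1.0, 3.0, 2.0])
    print(tokens, error)
```

In this sketch the token-level search proposes several candidate bars per hypothesis, and the bar-level step alone decides which of them survive, so the tension target never alters the model's token probabilities directly. That separation is what makes the control purely inference-time; a real system would substitute the trained model and the chosen tension measure for the two toy functions.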
Submission Number: 22