Keywords: Applications of interpretability, Concept Discovery (e.g., SAEs, dictionary learning)
TL;DR: We introduce DC-SAE, a dual-contrastive sparse autoencoder that factors MusicGen activations from musical performances into work-identity and performance-variation branches, yielding interpretable and steerable features of musical interpretation.
Abstract: For decades, philosophers and musicologists have debated which features of a performance are constitutive of the work and which express how it is being interpreted. Computational evidence has been hard to assemble: audio embedding models conflate work identity with performance style, and existing interpretability tools for music generators recover flat dictionaries that mix the two. We address this gap with the dual-contrastive sparse autoencoder (DC-SAE), a two-branch sparse autoencoder that uses coarse work-level metadata to factor a frozen MusicGen transformer's residual stream into work-identity and performance-variation subspaces. Across classical-work, jazz-standard, and pop-cover corpora, the resulting decomposition is supported by probes and feature galleries that surface musically interpretable concepts on each side. Without performer supervision, the variation branch acquires structure related to performer identity, and steering along these directions can shift the perceived performer of generated audio while preserving the underlying work. Together, these results show that MusicGen, a generative music transformer, contains a model-internal representation of musical interpretation that can be recovered, interpreted, and steered.
Submission Number: 534