Hyperbolic Music Representations

ICLR 2026 Conference Submission 18547 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: music generation, controllable models, hyperbolic geometry
TL;DR: Hierarchical music structures are represented in hyperbolic geometry, yielding an interpretable latent space.
Abstract: Music is inherently hierarchical, owing to keys and to variations of note sequences. To learn an appropriate representation space, the chosen metric must capture these dependencies. Although Euclidean geometry is frequently used to embed music, it cannot faithfully capture such hierarchical structures. In this paper, we propose to learn hyperbolic representation spaces for music using Variational Autoencoders with a Poincaré ball latent space as a natural alternative to Euclidean geometry. The resulting latent space is interpretable, reflects keys and musical richness, and allows for meaningful interpolations through a novel generalization of Spherical Linear Interpolation (Slerp) to Riemannian manifolds. Empirically, we compare our approach to standard Euclidean representations and observe that the latter fall short in terms of interpretability and reconstruction.
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 18547
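
For readers unfamiliar with the Poincaré ball mentioned in the abstract, the following is a minimal NumPy sketch of its distance function and of interpolation along a geodesic. It assumes curvature -1 and uses the standard Möbius-addition formulas. It is not the Slerp generalization proposed in the submission, whose details are not given here. The function names (`mobius_add`, `mobius_scale`, `poincare_distance`, `geodesic`) are illustrative choices, not taken from the paper.

```python
import numpy as np

def mobius_add(x, y):
    """Mobius addition on the unit Poincare ball (curvature -1, assumed)."""
    xy = np.dot(x, y)
    x2 = np.dot(x, x)
    y2 = np.dot(y, y)
    num = (1 + 2 * xy + y2) * x + (1 - x2) * y
    den = 1 + 2 * xy + x2 * y2
    return num / den

def mobius_scale(t, v, eps=1e-12):
    """Mobius scalar multiplication: rescales v along the ray through the origin."""
    norm = np.linalg.norm(v)
    if norm < eps:
        return np.zeros_like(v)
    return np.tanh(t * np.arctanh(norm)) * v / norm

def poincare_distance(x, y):
    """Hyperbolic distance between two points inside the unit ball."""
    return 2 * np.arctanh(np.linalg.norm(mobius_add(-x, y)))

def geodesic(x, y, t):
    """Point at fraction t in [0, 1] along the geodesic from x to y."""
    return mobius_add(x, mobius_scale(t, mobius_add(-x, y)))

# Illustrative latent codes (hypothetical values, both inside the unit ball).
x = np.array([0.1, 0.2])
y = np.array([-0.3, 0.5])
midpoint = geodesic(x, y, 0.5)
```

Unlike a straight Euclidean line between `x` and `y`, the path returned by `geodesic` bends toward the origin of the ball, and hyperbolic distance grows linearly in `t` along it, so `midpoint` is equidistant from both endpoints under `poincare_distance`.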