Why Music Moves Us: A Computational Model of Aesthetic Experience and Creativity via Meta-Learned Active Inference
Keywords: Computational Aesthetics, Active Inference, Meta-Learning, Music Generation, Free Energy Principle, Computational Creativity, Few-Shot Learning
TL;DR: We propose and validate a theory that defines aesthetic pleasure as the rate of learning, and introduce APML, a model that achieves state-of-the-art few-shot music generation via meta-learned active inference.
Abstract: This paper addresses a fundamental aesthetic question, "Why is music beautiful?", by proposing a computable, unified theoretical framework. Grounded in Active Inference and the Free Energy Principle, we formalize aesthetic pleasure as the rate at which a generative model reduces its prediction cost (variational free energy). This principle offers a computational account of why a song by Taylor Swift can be profoundly pleasing, while a monotonous bell is boring and chaotic noise is aversive. We posit that aesthetically pleasing music maximizes this rate of free energy reduction, a dynamic akin to a rapid descent down a smooth slide. In contrast, monotony is a flat plane with no descent, and chaos is a rugged path on which no progress can be made. To operationalize the theory, we propose that the formation of musical taste is a meta-learning process for which this "aesthetic pleasure" serves as the intrinsic learning signal. On this basis, we design and implement the Aesthetic Priors Meta-Learner (APML), a novel dual-core generative engine. APML's decoupled design pairs a large-scale knowledge backbone with a lightweight aesthetic core, yielding an AI system that, for the first time, not only possesses musical knowledge but also makes intrinsic aesthetic judgments. For rigorous evaluation, we constructed the first meta-learning benchmark for few-shot music style transfer. Experimental results show that APML achieves state-of-the-art performance on the core challenges of this task, particularly stylistic consistency and musicality, while also demonstrating unprecedented alignment with our proposed theory-driven metrics (e.g., the rate of free energy reduction). These results provide strong empirical support for our theory, showing that optimizing for the dynamics of learning itself leads to more aesthetically aligned and adaptive generative agents.
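As a rough illustration of the slide, flat-plane, and rugged-path intuition, the following minimal sketch computes a pleasure signal as the mean per-step reduction in free energy. It assumes free energy is recorded as a discrete sequence over listening steps; the function name `pleasure_signal` and the three traces are hypothetical and are not taken from the paper, which may define the rate differently.

```python
from typing import Sequence


def pleasure_signal(free_energy: Sequence[float]) -> float:
    """Mean per-step reduction in free energy over a trace.

    Illustrative assumption: the "rate of free energy reduction" is
    approximated by the average finite difference, which telescopes to
    (first value - last value) / number of steps.
    """
    if len(free_energy) < 2:
        raise ValueError("need at least two free-energy values")
    steps = len(free_energy) - 1
    return (free_energy[0] - free_energy[-1]) / steps


# Hypothetical free-energy traces (made-up numbers, for illustration only).
monotone = [1.0, 1.0, 1.0, 1.0, 1.0]  # flat plane: nothing left to learn
chaotic = [5.0, 5.4, 4.7, 5.3, 5.0]   # rugged path: no net progress
engaging = [5.0, 3.5, 2.5, 1.8, 1.0]  # smooth slide: steady descent

for name, trace in [("monotone", monotone), ("chaotic", chaotic), ("engaging", engaging)]:
    print(f"{name}: {pleasure_signal(trace):.2f}")
```

Under this simplification the monotone and chaotic traces both yield a signal of zero, while the steadily descending trace yields a positive one. The sketch does not capture the distinction the abstract draws between boredom and aversion, which would need an additional term such as the absolute level of free energy.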
Primary Area: applications to neuroscience & cognitive science
Submission Number: 11435