Keywords: Music Information Retrieval, Graph Neural Networks, AI and Arts, Creative AI, Other
TL;DR: Music Analysis with Graph Neural Networks
Abstract: In recent years, the intersection of artificial intelligence and music informatics has gained significant traction. However, music analysis on symbolic music has not been explored to its full extent. This work investigates the application of Graph Neural Networks (GNNs) to diverse music analysis tasks on digitized classical music scores.
Music analysis, as a scholarly field and as a set of techniques, is crucial for comprehending and appreciating music. It systematically examines elements such as harmony, melody, rhythm, form, and instrumentation, revealing the interplay of compositional techniques and structural patterns. Analyzing music also highlights the foundational elements that are essential for creating new compositions.
This thesis examines the effectiveness of Graph Neural Networks (GNNs) on music analysis tasks such as Cadence Detection, Roman Numeral Analysis, Composer Classification, and Voice Separation, focusing on symbolic representations (i.e., musical scores given in some machine-readable encoding). We argue that a graph structure is a more natural representation for modeling a musical score than the feature- or token-based representations that have been used so far.
Our study begins by detailing the intricacies of symbolic music representations and the limitations of existing methods. We then explore graph representations of music, which make it possible to apply graph learning models. The graph emerges as a natural representation that encompasses the mixed hierarchical and sequential nature of a musical score. We propose a new graph model for the score in which vertices represent notes and edges capture the relations between them.
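To make the idea of a score graph concrete, the following is a minimal sketch, not the thesis's actual construction. It assumes that each note is described by an onset, a duration, and a MIDI pitch, and that edges encode three illustrative temporal relations ("onset", "consecutive", "during"). The names `Note` and `build_score_graph`, as well as the choice of relations, are assumptions made for this example only.

```python
from typing import List, NamedTuple, Tuple


class Note(NamedTuple):
    onset: float     # start time in beats
    duration: float  # length in beats
    pitch: int       # MIDI pitch number


# An edge is (source index, target index, relation label).
Edge = Tuple[int, int, str]


def build_score_graph(notes: List[Note]) -> List[Edge]:
    """Connect notes (vertices) by assumed temporal relations (edges).

    Hypothetical relation types used in this sketch:
      "onset"       : both notes start at the same time
      "consecutive" : the target starts exactly when the source ends
      "during"      : the target starts while the source is sounding
    Times are compared exactly, so the sketch expects quantized beat values.
    """
    edges: List[Edge] = []
    for i, u in enumerate(notes):
        u_end = u.onset + u.duration
        for j, v in enumerate(notes):
            if i == j:
                continue
            if u.onset == v.onset:
                edges.append((i, j, "onset"))
            elif u_end == v.onset:
                edges.append((i, j, "consecutive"))
            elif u.onset < v.onset < u_end:
                edges.append((i, j, "during"))
    return edges


# A toy four-note fragment: a held C4 under E4, then G4, then C5.
example_notes = [
    Note(onset=0, duration=2, pitch=60),
    Note(onset=0, duration=1, pitch=64),
    Note(onset=1, duration=1, pitch=67),
    Note(onset=2, duration=1, pitch=72),
]
example_edges = build_score_graph(example_notes)
```

The resulting edge list, together with per-note features such as pitch and duration, could then be passed to a graph learning model. The quadratic loop is chosen for readability; a real score would call for a sorted or vectorized construction.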
As a first step, we experimentally compare graph representations to other modeling approaches such as piano rolls, note arrays, or custom descriptors on a number of different music classification tasks. Next, we design GNN-based machine learning models specifically for music analysis, capable of addressing the unique challenges posed by music data and the targeted application. As a result, we obtain graph-based models that demonstrate improved performance on benchmark datasets for diverse analysis tasks. Furthermore, we develop a new generic graph convolution block, based on perception-inspired principles, that further improves performance on music understanding tasks. We present a framework for deriving and visualizing explanations of the decisions made by our music-related GNNs. Finally, we develop and publish a dedicated library for symbolic music graph processing in order to reinforce the impact of this work in the research community.
Current AI trends suggest that Large Language Models (LLMs) and transformers have the potential to solve a wide range of tasks in various fields. However, our findings indicate that graphs can be more efficient for music analysis tasks. Thus, the ultimate goal of this thesis is to establish graph-based modeling as a standard approach to computational symbolic music analysis.
Emmanouil Karystinaios's PhD Thesis at Johannes Kepler University, submitted in October 2024, at Linz, Austria
Submission Number: 149