Decoding Musical Perception: Music Stimuli Reconstruction from Brain Activity

Published: 10 Oct 2024, Last Modified: 30 Oct 2024 · Audio Imagination: NeurIPS 2024 Workshop · CC BY 4.0
Keywords: Computational neuroscience, Brain decoding, Generative models, Music reconstruction
TL;DR: This study explores the use of generative models to decode and reconstruct music stimuli from fMRI brain data.
Abstract: This study explores the feasibility of reconstructing musical stimuli from functional MRI (fMRI) data using generative models. Specifically, we employ MusicLDM, a latent diffusion model that generates music from text descriptions, to decode musical stimuli from fMRI signals. We first identify music-responsive regions in the brain by correlating neural activity with representations derived from the CLAP (Contrastive Language-Audio Pretraining) model. We then map the fMRI data from these music-responsive regions to the latent embeddings of MusicLDM using regression models, without relying on empirical descriptions of the musical stimuli. To enhance between-subject consistency, we apply functional alignment techniques to align neural data across participants. Evaluated with identification accuracy, the reconstructed embeddings show a high correspondence with the original musical stimuli in the MusicLDM space, reaching 91.4% and surpassing previous methods. In addition, a human evaluation experiment showed that participants identified the correct decoded stimulus with an average accuracy of 84.1%, further demonstrating the perceptual similarity between the original and reconstructed music. Future work will aim to improve temporal resolution and investigate applications in music cognition.
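The abstract describes a two-stage pipeline: a regression model maps fMRI responses from music-responsive voxels to audio embeddings, and reconstructions are scored with identification accuracy. The sketch below is not the authors' code; it illustrates that pipeline under stated assumptions (scikit-learn ridge regression as the mapping model, cosine similarity as the matching metric, and synthetic arrays standing in for the fMRI data and MusicLDM/CLAP embeddings).

```python
# Minimal sketch (assumptions, not the paper's implementation) of decoding fMRI
# responses into audio embeddings and scoring them with identification accuracy.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.metrics.pairwise import cosine_similarity

def fit_decoder(X_train, Y_train):
    """Fit a ridge regression from fMRI responses (n_trials, n_voxels)
    to target audio embeddings (n_trials, embedding_dim)."""
    model = RidgeCV(alphas=np.logspace(-2, 4, 13))
    model.fit(X_train, Y_train)
    return model

def identification_accuracy(Y_pred, Y_true):
    """Fraction of test trials whose predicted embedding is most similar
    to its own ground-truth embedding among all test stimuli."""
    sim = cosine_similarity(Y_pred, Y_true)       # (n_test, n_test) similarity matrix
    best_match = sim.argmax(axis=1)               # index of the closest true embedding
    return float(np.mean(best_match == np.arange(len(Y_true))))

# Synthetic example: 200 trials, 500 voxels, 64-dim embeddings (all hypothetical sizes).
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 500))
W = rng.standard_normal((500, 64))
Y = X @ W + 0.1 * rng.standard_normal((200, 64))  # noisy linear relation for illustration
decoder = fit_decoder(X[:150], Y[:150])
print("identification accuracy:", identification_accuracy(decoder.predict(X[150:]), Y[150:]))
```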
Submission Number: 22