Keywords: Brain decoding, music, multimodal, neural music decoding, content retrieval
TL;DR: We built a deep learning pipeline that decodes which songs participants listened to from their fMRI activity.
Abstract: Music is a universal phenomenon that profoundly influences human experiences across cultures. This study investigates whether musical tracks can be decoded from human brain activity measured with functional MRI (fMRI). Leveraging recent advances in large-scale datasets and pre-trained computational models, we constructed mappings between neural data and latent representations of musical stimuli. Our approach integrates functional and anatomical alignment techniques to facilitate cross-subject decoding, addressing the challenges posed by the low temporal resolution and noise of fMRI data. We used the GTZAN fMRI dataset, in which five participants listened to 540 musical tracks from 10 different genres while their brain activity was recorded. Using the CLAP (Contrastive Language-Audio Pretraining) model, we extracted latent representations of the musical tracks and developed voxel-wise encoding models to identify brain regions responsive to these stimuli. By applying a threshold to the correlation between predicted and actual brain activity, we identified specific regions of interest (ROIs) for music processing.
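A minimal sketch of this voxel-wise encoding step, using synthetic stand-ins for the CLAP embeddings and fMRI responses (array shapes, the ridge penalty, and the 0.2 correlation threshold are illustrative assumptions, not values reported in the paper):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins: real inputs would be CLAP embeddings of the GTZAN
# clips and the corresponding per-voxel fMRI responses.
rng = np.random.default_rng(0)
n_clips, n_dims, n_voxels = 540, 512, 2000
clap_features = rng.standard_normal((n_clips, n_dims))
brain_activity = rng.standard_normal((n_clips, n_voxels))

X_train, X_test, Y_train, Y_test = train_test_split(
    clap_features, brain_activity, test_size=0.2, random_state=0
)

# Voxel-wise encoding model: predict every voxel's response from CLAP features.
encoder = Ridge(alpha=1.0)
encoder.fit(X_train, Y_train)
Y_pred = encoder.predict(X_test)

# Pearson correlation between predicted and measured activity, per voxel.
def columnwise_corr(a, b):
    a = (a - a.mean(0)) / a.std(0)
    b = (b - b.mean(0)) / b.std(0)
    return (a * b).mean(0)

voxel_corr = columnwise_corr(Y_pred, Y_test)

# Keep voxels whose prediction accuracy exceeds the chosen threshold as the
# music-responsive ROI (threshold value here is purely illustrative).
roi_mask = voxel_corr > 0.2
print(f"{roi_mask.sum()} voxels selected for the music ROI")
```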
Our decoding pipeline, primarily retrieval-based, employed ridge regression to map brain activity in the identified ROIs to the corresponding CLAP features. This enabled us to predict CLAP embeddings from neural data and retrieve the most similar musical tracks in the latent space. The results demonstrated state-of-the-art identification accuracy, with our methods significantly outperforming existing approaches. The findings highlight the potential for neural-based music retrieval systems, opening new avenues for personalized music recommendations and therapeutic applications. Future work could explore the use of higher temporal resolution neuroimaging methods and more sophisticated generative models to further enhance decoding accuracy and probe the neural underpinnings of music perception and emotion.
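The retrieval-based decoding step could be sketched as follows, again on synthetic data (the ROI size, ridge penalty, and train/test split are assumptions for illustration only):

```python
import numpy as np
from sklearn.linear_model import Ridge

# Hypothetical stand-ins for ROI-restricted fMRI responses and CLAP embeddings
# of the held-out candidate tracks.
rng = np.random.default_rng(1)
n_train, n_test, n_roi_voxels, n_dims = 480, 60, 800, 512
roi_train = rng.standard_normal((n_train, n_roi_voxels))
roi_test = rng.standard_normal((n_test, n_roi_voxels))
clap_train = rng.standard_normal((n_train, n_dims))
clap_test = rng.standard_normal((n_test, n_dims))

# Decoder: ridge regression from ROI activity to the CLAP latent space.
decoder = Ridge(alpha=10.0)
decoder.fit(roi_train, clap_train)
predicted_embeddings = decoder.predict(roi_test)

# Retrieval: rank candidate tracks by cosine similarity to each predicted embedding.
def normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

similarity = normalize(predicted_embeddings) @ normalize(clap_test).T

# Identification accuracy: fraction of test clips whose true track ranks first.
top1 = (similarity.argmax(axis=1) == np.arange(n_test)).mean()
print(f"top-1 identification accuracy: {top1:.3f}")
```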
Submission Number: 15