COSMIC: Music emotion recognition combining structure analysis and modal interaction

Published: 01 Jan 2024, Last Modified: 08 Apr 2024, Multim. Tools Appl. 2024, Readers: Everyone
Abstract: As a common multi-modal information carrier, music frequently conveys emotion through lyrics and melodies. Besides lyrics (text) and melodies (audio), the structure of a song is another indicator of emotion, and it can create strong resonance with listeners. Typically, a pop song is composed of verses and choruses. To improve the performance of existing music emotion recognition models, we first propose a hierarchical model to analyze music structure. Then, a cross-modal interaction method is developed to extract emotional features from the different modalities and let them interact. Finally, we perform music emotion recognition by combining music structure analysis and cross-modal interaction. Extensive experiments are conducted on a dataset crawled from Netease Cloud Music, and the results demonstrate the effectiveness of both music structure analysis and cross-modal interaction. The proposed model, COSMIC, achieves state-of-the-art performance on music emotion recognition tasks.