Semantic Interaction Fusion Framework for Multimodal Sentiment Recognition

Published: 01 Jan 2023 · Last Modified: 06 Jun 2025 · SMC 2023 · CC BY-SA 4.0
Abstract: Multimodal sentiment recognition has gained considerable attention for its relevance to various applications. To improve performance, it is critical to extract semantic information and fuse multimodal features. However, most current methods either emphasize single-modal semantic extraction and representation or lack deep-level semantic integration. In this paper, we propose a Semantic Interaction Fusion Framework (SIFF), which extracts the semantic information that evokes a specific sentiment from multiple modalities and integrates multimodal semantic information via a gate attention fusion module. The gate attention fusion module fuses multimodal semantic information adaptively, suppressing conflicting information and strengthening the interaction of emotional cues across modalities. We conduct experiments on two benchmark datasets, CMU-MOSI and CMU-MOSEI, where our method achieves accuracies of 87.2% and 86.5%, respectively, an absolute improvement of 1.1% and 1.8% over the current state-of-the-art.
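The abstract describes a gate attention fusion module that adaptively weights each modality's semantic features to suppress conflicting information. The following is a minimal illustrative sketch of the general gated-fusion idea for two modalities; the function and parameter names are hypothetical and the actual SIFF architecture is not specified in the abstract.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(text_feat, audio_feat, W, b):
    # Generic gating sketch (not the paper's exact module): a learned gate
    # computed from both modalities decides, per dimension, how much of each
    # modality's feature passes into the fused representation.
    gate = sigmoid(np.concatenate([text_feat, audio_feat]) @ W + b)
    return gate * text_feat + (1.0 - gate) * audio_feat

# Toy example with random "features" and untrained gate parameters.
rng = np.random.default_rng(0)
d = 4
t = rng.normal(size=d)          # text-modality feature (hypothetical)
a = rng.normal(size=d)          # audio-modality feature (hypothetical)
W = rng.normal(size=(2 * d, d)) # gate weights, learned in practice
b = np.zeros(d)                 # gate bias

fused = gated_fusion(t, a, W, b)
print(fused.shape)  # (4,)
```

Because the gate lies in (0, 1), each fused dimension is a convex combination of the two modality features, which is how such gates can attenuate a modality that carries conflicting cues.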