Decoupling and Reconstructing: A Multimodal Sentiment Analysis Framework Towards Robustness

Published: 2025, Last Modified: 25 Jan 2026IJCAI 2025EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: Multimodal sentiment analysis (MSA) has shown promising results but often poses significant challenges in real-world applications due to its dependence on the complete and aligned multimodal sequences. While existing approaches attempt to address missing modalities through feature reconstruction, they often neglect the complex interplay between homogeneous and heterogeneous relationships in multimodal features. To address this problem, we propose Decoupled-Adaptive Reconstruction (DAR), a novel framework that explicitly addresses these limitations through two key components: (1) a mutual information-based decoupling module that decomposes features into common and independent representations, and (2) a reconstruction module that independently processes these decoupled features before fusion for downstream tasks. Extensive experiments on two benchmark datasets demonstrate that DAR significantly outperforms existing methods in both modality reconstruction and sentiment analysis tasks, particularly in scenarios with missing or unaligned modalities. Our results show improvements of 2.21% in bi-classification accuracy and 3.9% in regression error compared to state-of-the-art baselines on the MOSEI dataset.
Loading