Hyper-Modality Enhancement for Multimodal Sentiment Analysis with Missing Modalities

Published: 18 Sept 2025, Last Modified: 29 Oct 2025, NeurIPS 2025 poster, CC BY 4.0
Keywords: multimodal sentiment analysis; incomplete learning
TL;DR: HME tackles the challenge of missing modalities in multimodal sentiment analysis by generating hyper-modality representations, enhancing information integration and reducing reliance on complete datasets.
Abstract: Multimodal Sentiment Analysis (MSA) aims to infer human emotions by integrating complementary signals from diverse modalities. However, in real-world scenarios, missing modalities are common due to data corruption, sensor failure, or privacy concerns, which can significantly degrade model performance. To tackle this challenge, we propose Hyper-Modality Enhancement (HME), a novel framework that avoids explicit modality reconstruction by enriching each observed modality with semantically relevant cues retrieved from other samples. This cross-sample enhancement reduces reliance on fully observed data during training, making the method better suited to scenarios with inherently incomplete inputs. In addition, we introduce an uncertainty-aware fusion mechanism that adaptively balances original and enriched representations to improve robustness. Extensive experiments on three public benchmarks show that HME consistently outperforms state-of-the-art methods under various missing modality conditions, demonstrating its practicality in real-world MSA applications.
Primary Area: Applications (e.g., vision, language, speech and audio, Creative AI)
Submission Number: 4547