Hierarchical Multi-Criteria Representation Fusion for Robust Incomplete Multimodal Sentiment Analysis

Yijing Dai, Yingjian Li, Jinxing Li, Guangming Lu

Published: 01 Jan 2025. Last Modified: 27 Jan 2026. IEEE Transactions on Affective Computing. License: CC BY-SA 4.0
Abstract: The challenge of improving robustness to missing data in multimodal sentiment analysis (MSA) has recently attracted increasing attention. Existing work performs sentence-level sentiment analysis, extracting vector representations from input multimodal sentence sequences for sentiment regression or classification. However, most studies focus primarily on exploring cross-modal correlations, neglecting the dynamic semantic variations across sequence frames during feature merging and fusion. In this paper, we propose the Hierarchical Multi-criteria Representation Fusion (HMRF) framework, which effectively captures dynamic semantic information across frames and modalities. Specifically, HMRF consists of two parts: sequence merging and information transfer. For sequence merging, we design a hierarchical cross-modal semantic perception mechanism and a multi-criteria feature merging mechanism, which together mitigate unimodal affective bias and progressively integrate incomplete feature sequences. For information transfer, hierarchical information distillation and high-level semantic reconstruction transfer attribute and sentiment information from the complete view to the incomplete view during training. Extensive experiments on multiple benchmark datasets consistently demonstrate that the proposed HMRF outperforms existing baselines.
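The abstract's information-transfer idea — learning to recover complete-view representations from an incomplete view during training — can be illustrated with a minimal sketch. This is not the authors' implementation: the frame masking, the single linear "student" map, and the mean-squared reconstruction loss are all assumptions chosen for simplicity.

```python
# Toy sketch of complete-to-incomplete view transfer (assumed setup,
# not HMRF itself): some frames of a feature sequence are zeroed out
# to mimic missing modality data, and a linear map is trained to
# reconstruct the complete-view features from the incomplete input.
import numpy as np

rng = np.random.default_rng(0)
T, D = 16, 8  # frames per sequence, feature dimension (assumed)

# Complete-view features and an incomplete view with ~30% frames missing.
complete = rng.normal(size=(T, D))
mask = rng.random(T) > 0.3           # True = frame observed
incomplete = complete * mask[:, None]

# Student: a single linear map from incomplete to complete features.
W = rng.normal(scale=0.1, size=(D, D))

def loss(W):
    recon = incomplete @ W
    return float(np.mean((recon - complete) ** 2))

lr = 0.05
initial = loss(W)
for _ in range(200):
    recon = incomplete @ W
    # Gradient of the mean-squared reconstruction loss w.r.t. W.
    grad = 2 * incomplete.T @ (recon - complete) / (T * D)
    W -= lr * grad
final = loss(W)
```

After training, `final` is well below `initial`, though a residual floor remains: fully masked frames carry no signal, which is exactly the gap that richer hierarchical distillation mechanisms like those described in the paper aim to close.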