Latent Distribution Decouple for Uncertain-Aware Multimodal Multi-label Emotion Recognition

ACL ARR 2025 February Submission4018 Authors

15 Feb 2025 (modified: 09 May 2025), ACL ARR 2025 February Submission, CC BY 4.0
Abstract:

Multimodal multi-label emotion recognition (MMER) aims to identify the simultaneous presence of multiple emotions in multimodal data. Existing studies primarily focus on improving fusion strategies and modeling modality-to-label dependencies. However, they often overlook the impact of $\textbf{aleatoric uncertainty}$, which arises from inherent noise in multimodal data and hinders modality fusion by introducing ambiguity into feature representations. To address this issue and effectively model aleatoric uncertainty, we propose Latent Emotional Distribution Decomposition with Uncertainty Perception (LDDU), a novel framework based on latent emotional space probabilistic modeling. Specifically, we introduce a contrastive disentangled distribution mechanism within the emotion space, enabling the extraction of both semantic features and uncertainty representations. Furthermore, we design an uncertainty-aware multimodal fusion method that accounts for the dispersed nature of uncertainty and integrates distributional information to enhance emotion representation learning. Experimental results on CMU-MOSEI and M$^3$ED demonstrate that LDDU achieves state-of-the-art performance, underscoring the importance of uncertainty modeling in MMER. The related code will be released to facilitate further research.

Paper Type: Long
Research Area: Sentiment Analysis, Stylistic Analysis, and Argument Mining
Research Area Keywords: argument mining
Languages Studied: English
Submission Number: 4018