EmotionTalk: An Interactive Chinese Multimodal Emotion Dataset With Rich Annotations

ICLR 2026 Conference Submission 18310 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: dialogue; speech captioning; multimodal emotion recognition
Abstract: In recent years, emotion recognition has played an increasingly important role in applications such as human-computer interaction, mental health monitoring, and sentiment analysis. Although many sentiment analysis datasets exist for mainstream languages such as English, high-quality, naturally recorded multimodal dialogue datasets remain extremely scarce for Chinese, a language with unique linguistic characteristics, rich cultural connotations, and complex multimodal interaction features. In this work, we propose EmotionTalk, an interactive Chinese multimodal emotion dataset with rich annotations. The dataset provides acoustic, visual, and textual data from 19 actors participating in dyadic conversations. It includes 23.6 hours of speech (19,250 utterances), utterance-level annotations for 7 emotion categories (happy, surprise, sad, disgust, anger, fear, and neutral), 5-class sentiment labels (negative, weakly negative, neutral, weakly positive, and positive), and speech captions along 4 dimensions (speaker, speaking style, emotion, and overall). The dataset is well suited to research on unimodal and multimodal emotion recognition, missing-modality challenges, and speech captioning. To our knowledge, it is the first high-quality, versatile Chinese multimodal dialogue emotion dataset, making it a valuable resource for research on cross-cultural emotion analysis and recognition. We also conduct experiments on EmotionTalk to demonstrate the effectiveness and quality of the dataset. The EmotionTalk dataset will be made freely available for all academic purposes.
Primary Area: datasets and benchmarks
Submission Number: 18310