EMOCAP: A Deep Dive into the Systematic Assessment of Large Language Models' Emotional Intelligence through Multi-Turn Conversations
Abstract: Large Language Models (LLMs) often lack robust emotional intelligence, limiting their effectiveness in sensitive domains such as mental health and crisis response.
Existing open-source LLMs struggle to track nuanced emotions over multi-turn dialogues, resulting in shallow or misaligned responses. Proprietary models show promise but remain closed-source, hindering transparent evaluation and improvement. To address these limitations, we propose EMOCAP, a comprehensive emotional intelligence framework that integrates well-established psychological frameworks (e.g., Ekman, Plutchik, Russell, Goleman, and the Affective Domain of Bloom's Taxonomy) for enhanced emotion detection, contextual adaptation, and ethical alignment. We develop a multi-turn, domain-general dataset and evaluation protocol to test how LLMs manage evolving emotions, mixed affective states, and subtle cues. Our experiments compare baseline open-source LLMs (Gemma-2-9b, Qwen2.5-7b, and Llama-3-8B) against their instruction fine-tuned counterparts (Gemma-2-9b-It, Qwen2.5-7b-It, and Llama-3-8B-It). Models that incorporate the recognition and response guidelines demonstrate better emotional tracking, fewer repetitive responses, and more ethically aligned outputs than the standard baselines, although complex scenarios (e.g., sarcasm) remain challenging. By providing an open-source taxonomy and benchmark for emotional intelligence, this work lays the groundwork for empathetic, context-aware, and ethically responsible LLMs across various real-world applications.
Paper Type: Long
Research Area: Sentiment Analysis, Stylistic Analysis, and Argument Mining
Research Area Keywords: Computational Social Science and Cultural Analytics, Dialogue and Interactive Systems, Resource and Evaluation
Contribution Types: Model analysis & interpretability, Data resources, Data analysis
Languages Studied: English
Submission Number: 6497