Keywords: Emotional Intelligence, Long-Context
Abstract: Large language models (LLMs) have made significant progress in Emotional Intelligence (EI) and long-context understanding. However, existing benchmarks tend to overlook certain aspects of EI in long-context scenarios, especially under $\textit{realistic, practical settings}$ where interactions are lengthy, diverse, and often noisy. To move toward such realistic settings, we present $\textit{LongEmotion}$, a benchmark specifically designed for long-context EI tasks. It covers a diverse set of tasks, including $\textbf{Emotion Classification}$, $\textbf{Emotion Detection}$, $\textbf{Emotion QA}$, $\textbf{Emotion Conversation}$, $\textbf{Emotion Summary}$, and $\textbf{Emotion Expression}$. On average, the input length for these tasks reaches 8,777 tokens, with long-form generation required for $\textit{Emotion Expression}$. To enhance performance under realistic constraints, we incorporate Retrieval-Augmented Generation ($\textit{RAG}$) and Collaborative Emotional Modeling ($\textit{CoEM}$), and compare them with standard prompt-based methods. Unlike conventional approaches, our $\textit{RAG}$ method leverages both the conversation context and the large language model itself as retrieval sources, avoiding reliance on external knowledge bases. The $\textit{CoEM}$ method further improves performance by decomposing the task into five stages, integrating both retrieval augmentation and limited knowledge injection. Experimental results show that both $\textit{RAG}$ and $\textit{CoEM}$ consistently enhance EI-related performance across most long-context tasks, advancing LLMs toward more $\textit{practical and real-world EI applications}$. Furthermore, we conduct a detailed case study covering the performance comparison among GPT-series models, the application of CoEM at each stage and its impact on task scores, and the advantages of the LongEmotion dataset in advancing EI.
All of our code and datasets will be open-sourced and can be viewed at the anonymous repository link https://anonymous.4open.science/r/anonymous-578B.
Supplementary Material: zip
Primary Area: datasets and benchmarks
Submission Number: 10412