Abstract: Despite growing interest in emotion cause analysis in conversations, existing research is limited by the small scale of available datasets: the largest benchmarks contain only around 1,000 conversations. This scarcity makes it difficult to train effective models and to evaluate them reliably. Moreover, traditional benchmarks define each emotion cause as a single continuous span, which fails to capture the complex nature of conversational emotions, where causes are often scattered across multiple utterances. To address these limitations, we construct the Emotion Cause of DailyDialog (ECDaily), a large-scale dataset containing 13,118 conversations and 102,970 utterances, ten times larger than existing ones. ECDaily incorporates both individual and aggregated cause span annotations. In addition to the Individual-ECE and Individual-ECPE tasks, we introduce two new tasks, Aggregated-ECE and Aggregated-ECPE, along with a two-stage approach for handling multiple-span causes. We establish five baseline systems using several pre-trained language models for both the individual and the aggregated tasks. Extensive experiments demonstrate the effectiveness of the baselines trained on ECDaily across multiple tasks and indicate that ECDaily serves as a robust and comprehensive benchmark for advancing emotion cause analysis in conversations.
External IDs: dblp:journals/taffco/ShenLADX25