An Evaluation Framework for Emotional Companionship Capability of Dialogue Systems

ACL ARR 2026 January Submission 10390 Authors

06 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: PQAEF, ECDBench 1.0, Emotional Companionship Dialogue Systems, Evaluation Benchmark
Abstract: With the rapid development of Large Language Models, dialogue systems are shifting from information tools to emotional companions, heralding the era of Emotional Companionship Dialogue Systems (ECDs) that provide personalized emotional support to users. However, the field lacks systematic evaluation standards. To address this, we design and implement the Four-Dimensional Capability Evaluation Framework (FDAEF), which hierarchically integrates a "Capability Layer → Task Layer (three-level) → Data Layer → Method Layer" structure. Building on FDAEF, we present ECDBench 1.0, the first ECD-specific benchmark. Through extensive evaluations of 30 mainstream models, we show that ECDBench 1.0 has strong discriminant validity and can effectively quantify differences in emotional companionship capability across models. The results also reveal current models' shortcomings in deep emotional companionship, guiding future technological optimization and helping developers improve the user experience of ECDs.
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: Emotional Companionship Dialogue Systems (ECDs), Affective Computing, LLMs, Agent
Contribution Types: Reproduction study, Publicly available software and/or pre-trained models, Data resources, Data analysis, Position papers
Languages Studied: English, Chinese
Submission Number: 10390