EMPATHYSCORE: A Reference-free Metric for Emotional Support Conversation via Knowledge Distillation

ACL ARR 2026 January Submission 10884 Authors

06 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0
Keywords: Emotional Support Conversation, Automatic Empathy Scoring
Abstract: Existing metrics for Emotional Support Conversation (ESC) struggle with the "one-to-many" nature of open-ended empathy or suffer from the high computational cost of LLM-based evaluation. To address these limitations, we introduce EmpathyScore, a reference-free, lightweight metric developed via the Distill-ES framework. Crucially, we construct Empathy-Eval, a comprehensive distillation dataset containing fine-grained teacher annotations and pairwise preferences derived from rubric-guided LLM prompting. Leveraging this data, we distill the sophisticated reasoning of a teacher LLM into compact student scorers using a hybrid regression-ranking objective, decoupling evaluation into Cognitive Suitability and Affective Resonance. Experiments demonstrate that EmpathyScore achieves state-of-the-art correlation with human judgments while being orders of magnitude more efficient than LLM judges.
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: evaluation and metrics
Contribution Types: Model analysis & interpretability, Data resources, Data analysis, Position papers
Languages Studied: English
Submission Number: 10884
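
The abstract mentions a hybrid regression-ranking objective for distilling teacher scores and pairwise preferences into a student scorer, but does not give its form. The following is a minimal illustrative sketch only, assuming a mean-squared-error regression term against teacher scores combined with a logistic pairwise ranking term; the function names, the mixing weight `alpha`, and both loss forms are assumptions for illustration, not the authors' formulation.

```python
import math

def mse_loss(pred, teacher):
    # Regression term (assumed form): mean squared error between
    # student predictions and teacher-assigned scores.
    return sum((p - t) ** 2 for p, t in zip(pred, teacher)) / len(pred)

def pairwise_ranking_loss(preferred, rejected):
    # Ranking term (assumed form): logistic loss -log(sigmoid(s_w - s_l)),
    # averaged over preference pairs, where s_w is the student score of the
    # preferred response and s_l the score of the rejected one.
    return sum(
        math.log1p(math.exp(-(w - l))) for w, l in zip(preferred, rejected)
    ) / len(preferred)

def hybrid_loss(pred, teacher, preferred, rejected, alpha=0.5):
    # Hypothetical combination: alpha weights regression against ranking.
    return alpha * mse_loss(pred, teacher) + (1 - alpha) * pairwise_ranking_loss(
        preferred, rejected
    )
```

Under these assumptions, the ranking term shrinks as the student scores preferred responses further above rejected ones, while the regression term keeps the absolute scores close to the teacher's scale. In a two-dimensional setup such as the one the abstract describes, one such loss could be computed per dimension (Cognitive Suitability and Affective Resonance), though the abstract does not say how the two are combined.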