Efficient-Empathy: Towards Efficient and Effective Empathetic Data Selection

ACL ARR 2025 May Submission985 Authors

16 May 2025 (modified: 03 Jul 2025), ACL ARR 2025 May Submission, CC BY 4.0

Abstract: Empathy is a fundamental pillar of human social intelligence and a critical requirement for the development of human-centered artificial general intelligence (AGI). While large language models (LLMs) have shown remarkable general-purpose capabilities, their empathetic reasoning remains limited, largely due to the scarcity of high-quality training data. Prior work in empathetic modeling often relies on shallow emotional cues or architectural enhancements, overlooking the heterogeneous and multi-dimensional nature of empathy itself. In this work, we propose a data-efficient empathy learning framework that integrates insights from psychology—specifically, the dual dimensions of sensibility and rationality—as guiding criteria for high-quality data selection. Our approach leverages LLMs to automatically score and filter empathy dialogues, constructing curated datasets that emphasize emotionally grounded and cognitively coherent responses. We then train specialized sensibility and rationality experts, and dynamically combine their capabilities via a Mixture-of-Experts (MoE) model. Empirical results demonstrate that our framework not only achieves state-of-the-art empathetic generation but does so using significantly fewer data samples, affirming the importance of quality-driven selection in scaling empathetic AGI.
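The selection step described in the abstract can be illustrated with a minimal sketch. It assumes each dialogue has already been given a sensibility score and a rationality score by an LLM, and that selection is a simple threshold on each score. The names `ScoredDialogue` and `select_subsets`, the score range, and the threshold rule are hypothetical illustrations, not the authors' actual scoring prompt or selection criterion.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ScoredDialogue:
    """A dialogue with two scores (hypothetical; assumed to come from an LLM judge)."""
    text: str
    sensibility: float  # assumed score for emotional grounding
    rationality: float  # assumed score for cognitive coherence


def select_subsets(
    dialogues: List[ScoredDialogue], threshold: float
) -> Tuple[List[ScoredDialogue], List[ScoredDialogue]]:
    """Split a scored corpus into two curated subsets, one per dimension.

    A dialogue is kept in a subset when its score on that dimension is at
    least `threshold`. The thresholding rule is an assumption made for
    illustration; the abstract does not specify the exact criterion.
    """
    sensibility_set = [d for d in dialogues if d.sensibility >= threshold]
    rationality_set = [d for d in dialogues if d.rationality >= threshold]
    return sensibility_set, rationality_set
```

Under these assumptions, each subset would serve as training data for the corresponding expert (sensibility or rationality), which the abstract says are then combined through a Mixture-of-Experts model.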

Paper Type: Long

Research Area: Linguistic theories, Cognitive Modeling and Psycholinguistics

Research Area Keywords: Empathetic Data, Data Quality, Data Selection

Contribution Types: Model analysis & interpretability, Approaches to low-compute settings (efficiency), Data analysis

Languages Studied: English

Keywords: Empathetic Data, Data Quality, Data Selection

Submission Number: 985
