Preference Learning Unlocks LLMs’ Psycho-Counseling Skills

ACL ARR 2026 January Submission 1640 Authors

30 Dec 2025 (modified: 20 Mar 2026) · CC BY 4.0
Keywords: Large Language Model, Preference Learning, Psycho-Counseling
Abstract: Applying large language models (LLMs) to assist in psycho-counseling is an emerging and meaningful approach, driven by the significant gap between patient needs and the availability of mental health support. However, current LLMs struggle to consistently provide effective responses to client speeches, largely due to the lack of supervision from high-quality real psycho-counseling data, whose content is typically inaccessible due to client privacy concerns. Furthermore, the quality of therapists’ responses in available sessions can vary significantly based on their professional training and experience. Assessing the quality of therapists’ responses remains an open challenge. We address these challenges by first proposing a set of professional and comprehensive principles to evaluate therapists’ responses to client speeches. Using these principles, we create a **Psy**cho-**Co**unseling **Pref**erence dataset, **PsyCoPref**, which contains 36k high-quality preference comparison pairs. This dataset aligns with the preferences of professional psychotherapists, providing a robust foundation for evaluating and improving LLMs in psycho-counseling. Experiments on reward modeling and preference learning demonstrate that PsyCoPref is an excellent resource for LLMs to acquire essential skills for responding to clients in a counseling session. Our best-aligned model achieves an impressive win rate of 87% against GPT-4o.
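The abstract describes preference learning over pairwise comparisons of therapist responses. As a minimal illustration of what training on such a dataset can look like, the sketch below computes a Direct Preference Optimization (DPO)-style loss for a single preference pair; the record schema (`client_speech`, `chosen`, `rejected`) and the use of DPO are assumptions for illustration, not details confirmed by the paper.

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair: -log sigmoid(beta * margin),
    where the margin compares how much more the policy (vs. a frozen
    reference model) prefers the chosen response over the rejected one."""
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return math.log(1.0 + math.exp(-margin))  # equals -log(sigmoid(margin))

# Hypothetical preference record (schema assumed, not from the paper):
pair = {
    "client_speech": "I feel like nothing I do matters anymore.",
    "chosen": "...",    # response professional therapists prefer
    "rejected": "...",  # weaker response
}

# If the policy favors the chosen response more than the reference does,
# the margin is positive and the loss drops below log(2) ~= 0.693.
loss = dpo_loss(logp_chosen=-10.0, logp_rejected=-14.0,
                ref_logp_chosen=-12.0, ref_logp_rejected=-13.0)
```

In practice the log-probabilities would come from summing token log-likelihoods of each response under the policy and reference models; only the loss arithmetic is shown here.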
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: healthcare applications, clinical NLP
Contribution Types: Model analysis & interpretability, Publicly available software and/or pre-trained models, Data resources, Data analysis
Languages Studied: English
Submission Number: 1640