CBTDialog: A CBT-Oriented Multi-Turn Counseling Dialogue Dataset for Mental Health Support

ACL ARR 2026 January Submission9984 Authors

06 Jan 2026 (modified: 20 Mar 2026), ACL ARR 2026 January Submission, CC BY 4.0
Keywords: Counseling Dialogue Dataset, Controlled Text Generation, Cognitive Behavioral Therapy
Abstract: Amid persistent shortages in mental health services, large language models (LLMs) have emerged as promising tools for counseling support. However, training reliable counselor models requires high-quality data grounded in explicit therapeutic frameworks, whereas existing LLM-synthesized datasets often lack authenticity and professional intervention annotations, limiting controllable, framework-aligned generation. To address these challenges, we construct CBTDialog, a multi-turn counseling dialogue dataset focused on Cognitive Behavioral Therapy (CBT), consisting of real-world and simulated-client sessions with over 81k counselor–client utterances. Grounded in an authoritative CBT assessment tool and textbook, we provide a hierarchical intervention annotation schema comprising goal-level CBT skills and implementation-level dialogue strategies. Moreover, we propose CBT-Qwen3, a counselor model trained on CBTDialog that leverages reinforcement learning to explicitly guide and constrain the generation process under CBT interventions. Experiments demonstrate the effectiveness of our proposed model.
Paper Type: Long
Research Area: Human-AI Interaction/Cooperation and Human-Centric NLP
Research Area Keywords: Human-Centered NLP
Contribution Types: NLP engineering experiment, Publicly available software and/or pre-trained models, Data resources
Languages Studied: English
Submission Number: 9984