FOCUS: A Fine-Grained Customer-Oriented Sentiment Dialogue Summarization Dataset for Chinese Customer Service

ACL ARR 2026 January Submission1394 Authors

29 Dec 2025 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: NLP datasets; benchmarking; abstractive summarisation; factuality
Abstract: Dialogue summarization (DS) plays a vital role in improving customer service efficiency by automatically generating concise summaries from lengthy multi-turn dialogues. However, existing studies largely overlook the fine-grained sentiment dynamics expressed by customers, and most DS datasets lack detailed sentiment annotations. These limitations hinder both accurate service quality assessment and the development of sentiment-aware summarization models. To address these challenges, we propose a three-stage approach to building an aspect-aware sentiment dataset, comprising: (1) aspect-anchored dialogue rewriting, (2) dialogue-anchored explainable label generation, and (3) label-dialogue integrated summarization. Building upon this scheme, we construct FOCUS, a $\textbf{F}$ine-grained customer-$\textbf{O}$riented $\textbf{C}$hinese dialog$\textbf{U}$e $\textbf{S}$ummarization dataset. FOCUS contains 12,948 dialogues and is the first Chinese dataset annotated for multi-level aspects, sentiment polarity, opinion content, and emotions, together with customer-oriented sentiment summaries in both formatted and free-style forms. To demonstrate the challenges and utility of FOCUS, we benchmark a range of summarization models on it and observe that current methods often misalign aspects and sentiments. We also find that a Chain-of-Thought approach can enhance faithfulness and interpretability, highlighting promising directions for future research on this dataset. FOCUS serves as a valuable resource for advancing research in sentiment-aware DS and related tasks.
Paper Type: Long
Research Area: Resources and Evaluation
Research Area Keywords: NLP datasets; benchmarking; abstractive summarisation; factuality
Contribution Types: Data resources, Data analysis
Languages Studied: English, Chinese
Submission Number: 1394