Abstract: Document-grounded goal-oriented dialogue systems are designed to respond to user queries by leveraging relevant external information.
Previous studies have mainly focused on handling free-form documents, often overlooking structured data such as list items, which can represent a range of nuanced semantic relations. Motivated by the observation that even advanced language models like GPT-3.5 often miss semantic cues from lists, this paper aims to enhance dialogue systems' interpretation and use of structured lists. To this end, we introduce the List2Dial dataset, a novel benchmark for evaluating the ability of dialogue systems to respond effectively using list information. The dataset is created from unlabeled customer service documents using language models and model-based filtering to enhance data quality, and can be used both to fine-tune and to evaluate dialogue models. Beyond directly generating responses with fine-tuned models, we further investigate the explicit use of Intermediate Steps for List (ISL) information, including list types and alignment with user background, which better reflects how humans assess list items before formulating responses. Our experimental results demonstrate that models trained on List2Dial with our ISL approach outperform baselines across various metrics. Specifically, our fine-tuned Flan-T5-XL model shows increases of 3.1% in ROUGE-L, 4.6% in correctness, 4.5% in faithfulness, and 20.6% in completeness over models trained without filtering or the proposed ISL method. We make our source code and dataset publicly available.
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: knowledge augmented, applications, grounded dialog
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 1769