Making Task-Oriented Dialogue Datasets More Natural by Synthetically Generating Indirect User Requests

ACL ARR 2024 June Submission554 Authors

12 Jun 2024 (modified: 08 Jul 2024), ACL ARR 2024 June Submission, CC BY 4.0

Abstract: Indirect User Requests (IURs), such as "It's cold in here" instead of "Could you please increase the temperature?", are common in human-human task-oriented dialogue and require world knowledge and pragmatic reasoning from the listener. While large language models (LLMs) can handle these requests effectively, smaller models deployed on virtual assistants often struggle due to resource constraints. Moreover, existing task-oriented dialogue benchmarks lack sufficient examples of complex discourse phenomena such as indirectness. To address this, we propose a set of linguistic criteria along with an LLM-based pipeline for generating realistic IURs to test natural language understanding (NLU) and dialogue state tracking (DST) models before deployment in a new domain. We also release IndirectRequests, a dataset of IURs based on the Schema-Guided Dialogue (SGD) corpus, as a comparative testbed for evaluating the performance of smaller models in handling indirect requests.

Paper Type: Long

Research Area: Dialogue and Interactive Systems

Research Area Keywords: evaluation and metrics, task-oriented, commonsense reasoning, dialogue state tracking, automatic evaluation, NLP datasets

Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Data resources, Data analysis

Languages Studied: English

Submission Number: 554
