AmbiK: Dataset of Ambiguous Tasks in Kitchen Environment

ACL ARR 2024 December Submission2003 Authors

16 Dec 2024 (modified: 05 Feb 2025)ACL ARR 2024 December SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Abstract: As a part of an embodied agent, Large Language Models (LLMs) are typically used for behavior planning given natural language instructions from the user. However, dealing with ambiguous instructions in real-world environments remains a challenge for LLMs. Various methods for task ambiguity detection have been proposed. However, it is difficult to compare them because they are tested on different datasets, and there is no universal benchmark. For this reason, we propose AmbiK **Ambi**guous Tasks in **K**itchen Environment, the fully textual dataset of ambiguous instructions addressed to a robot in a kitchen environment. AmbiK was collected with the assistance of LLMs and is human-validated. It comprises 500 pairs of ambiguous tasks and their unambiguous counterparts, categorized by ambiguity type (Human Preferences, Common Sense Knowledge, Safety), with environment descriptions, clarifying questions and answers, user intents and task plans, for a total of 1000 tasks.
Paper Type: Long
Research Area: Dialogue and Interactive Systems
Research Area Keywords: evaluation and metrics, task-oriented, commonsense reasoning, embodied agents
Contribution Types: Data resources
Languages Studied: English
Submission Number: 2003
Loading