Structured Uncertainty-Guided Clarification for LLM Agents

ICLR 2026 Conference Submission 16336 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: ambiguity, LLM agents
Abstract: LLM agents extend large language models with the ability to act in the real world through tool calls, but ambiguous or incomplete user instructions often lead to incorrect invocations, failed tasks, and degraded user experience. We introduce a \textbf{principled formulation of structured uncertainty} over tool-call parameters, modeling joint tool--argument clarification as a POMDP. By optimizing an Expected Value of Perfect Information (EVPI) objective, our approach selects clarification questions that maximize expected task success, while an aspect-based cost function prevents redundant questioning. Building on this formulation, \textbf{SAGE-Agent} leverages structured uncertainty to improve interaction efficiency and task coverage, increasing coverage on ambiguous tasks by 7--39\% while asking 1.5--2.7$\times$ fewer clarification questions than strong prompting- and uncertainty-based baselines. To support evaluation, we present \textit{ClarifyBench}, the first benchmark for multi-turn, tool-augmented disambiguation, equipped with an LLM-based user simulator that enables realistic conversational progression across diverse domains, including document editing, vehicle control, stock trading, travel booking, and file-system manipulation. Finally, we show that structured uncertainty serves as an effective reward model for reinforcement learning: on the When2Call dataset, uncertainty-weighted training boosts accuracy from 36.5\% to 65.2\% for a 3B model and from 36.7\% to 62.9\% for a 7B model. These results demonstrate that structured uncertainty offers a principled, efficient approach for tool-augmented LLM agents, improving both task success and interaction efficiency in multi-turn, real-world scenarios.
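
To make the EVPI-based selection rule in the abstract concrete, here is a minimal sketch in Python. It is a hypothetical illustration, not the paper's implementation: the belief representation (a distribution over candidate tool calls), the candidate-question answer model, and the aspect_cost penalty are all illustrative assumptions standing in for the POMDP formulation and aspect-based cost function the abstract describes.

    # Hypothetical sketch of EVPI-based clarification question selection.
    # Beliefs are distributions over candidate tool calls; questions carry
    # an answer model mapping each possible answer to (probability, posterior).

    def expected_success(belief):
        """Probability that acting now succeeds: mass on the best hypothesis."""
        return max(belief.values())

    def evpi(belief, answers):
        """Expected Value of Perfect Information for one question:
        expected post-answer success minus current success."""
        post = sum(p * expected_success(b) for p, b in answers.values())
        return post - expected_success(belief)

    def select_question(belief, questions, asked_aspects, aspect_cost=0.05):
        """Pick the question maximizing EVPI minus an aspect-based cost.
        Re-asking a covered aspect is penalized, discouraging redundant
        questions; returning None means: stop clarifying and call the tool."""
        best, best_gain = None, 0.0
        for q in questions:
            cost = aspect_cost * (1 + asked_aspects.count(q["aspect"]))
            gain = evpi(belief, q["answers"]) - cost
            if gain > best_gain:
                best, best_gain = q, gain
        return best

    if __name__ == "__main__":
        # Ambiguous instruction: "book me a flight to Springfield".
        belief = {"fly(SGF)": 0.55, "fly(SPI)": 0.45}
        questions = [
            {
                "aspect": "destination",
                "text": "Which Springfield: Missouri or Illinois?",
                "answers": {
                    "Missouri": (0.55, {"fly(SGF)": 1.0, "fly(SPI)": 0.0}),
                    "Illinois": (0.45, {"fly(SGF)": 0.0, "fly(SPI)": 1.0}),
                },
            },
        ]
        q = select_question(belief, questions, asked_aspects=[])
        print(q["text"] if q else "act now")

In this toy example the question fully resolves the ambiguity, so its EVPI (0.45) exceeds the aspect cost and it is asked; once the aspect has been covered, the penalty grows and the same aspect is not re-queried.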
Primary Area: foundation or frontier models, including LLMs
Submission Number: 16336