Keywords: Underspecified Instructions, User Goal, Clarification Policy, Preference Optimization
Abstract: Tool-using language agents often struggle with underspecified user instructions because they are uncertain about the user’s true goal. We propose a plug-and-play Clarifier-augmented training and evaluation framework built on $\tau$-Bench, in which a dedicated Clarifier is triggered after tool calls to decide whether and how to ask follow-up questions, without modifying the agent’s internal policy. We compare supervised fine-tuning (SFT), direct preference optimization (DPO), and a user-goal-driven group relative policy optimization (GRPO) method that trains the Clarifier with goal-aware preference signals conditioned on the ground-truth user requirement. Experiments across domains and agent backbones show that learned clarification consistently improves task success, with user-goal-driven GRPO achieving some of the strongest cross-domain generalization, including robust gains on the out-of-distribution airline domain. In cross-agent evaluations over six agent backbones, the learned Clarifier improves the success rate by 5.2\% on average while adding only 0.1 interaction steps on average, demonstrating that a learned, plug-and-play clarification policy is effective with minimal interaction overhead. Our code and data are available at the anonymous repository: \url{https://anonymous.4open.science/r/submission-E65F}.
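The loop below is a minimal sketch of the plug-and-play setup the abstract describes, assuming a $\tau$-Bench-style agent/tool interface: the Clarifier is consulted only after tool calls, and the agent's own policy is never modified. Every name here (Action, agent_step, call_tool, clarifier_decide, user_reply) is a hypothetical stand-in, not the authors' API; the actual interfaces live in the linked repository.

```python
# Hedged illustration of a Clarifier-in-the-loop episode; all functions
# below are toy stand-ins, not the paper's implementation.
from dataclasses import dataclass

@dataclass
class Action:
    is_tool_call: bool
    text: str = ""
    is_done: bool = False

def agent_step(history):
    # Stand-in for the frozen agent backbone choosing its next action.
    return Action(is_tool_call=False, text="final answer", is_done=True)

def call_tool(action):
    # Stand-in for executing the agent's tool call and returning output.
    return f"tool result for: {action.text}"

def clarifier_decide(history, tool_result):
    # Stand-in for the trained Clarifier policy: given the dialogue
    # history and the latest tool output, return a follow-up question,
    # or None when no clarification is needed.
    return None

def user_reply(question):
    # Stand-in for the (simulated) user answering a clarifying question.
    return f"answer to: {question}"

def run_episode(instruction, max_steps=30):
    """Run one episode; the Clarifier is triggered only after tool
    calls, leaving the agent's internal policy untouched."""
    history = [instruction]
    for _ in range(max_steps):
        action = agent_step(history)
        if action.is_tool_call:
            result = call_tool(action)
            history.append(result)
            question = clarifier_decide(history, result)
            if question is not None:
                history.append(user_reply(question))
        else:
            history.append(action.text)
            if action.is_done:
                break
    return history

print(run_episode("Book me a flight."))  # toy run of the loop
```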
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: Dialogue and Interactive Systems, Human-Centered NLP, Machine Learning for NLP, NLP Applications
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Approaches to low-resource settings, Data resources
Languages Studied: English
Submission Number: 5247