CLARA: Clarification-Driven Measurement of Input Ambiguity in LLMs

17 Sept 2025 (modified: 29 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: LLM, Ambiguity
TL;DR: Measuring input ambiguity in LLMs via generated clarifications
Abstract: Large Language Models (LLMs) perform well on question-answering tasks with well-specified inputs, but real-world queries are often vague or underspecified, leading to ambiguity and unreliable responses. Existing methods for ambiguity detection typically use a two-stage framework: (a) generating multiple clarifying reformulations of the input, and (b) answering each version to assess ambiguity based on the variation in responses. We introduce CLARA, a novel and complementary approach that quantifies ambiguity using only the clarification generation phase. We hypothesize that ambiguous inputs elicit a greater number and diversity of clarifications. CLARA estimates ambiguity by measuring the semantic dispersion of these LLM-generated clarifications, without requiring subsequent answering. This method requires no additional task-specific training, relying instead on an off-the-shelf similarity model, and thus offers two key benefits: (1) it is lightweight—reducing API calls and computational cost, and (2) it is more robust across LLMs—avoiding dependence on model-specific factual knowledge and reducing susceptibility to hallucinations. Empirical results across multiple LLMs and benchmark datasets demonstrate that CLARA provides an intuitive, scalable, and effective alternative to answer-based techniques, achieving comparable or superior performance.
Primary Area: applications to computer vision, audio, language, and other modalities
Supplementary Material: zip
Submission Number: 8894
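
The abstract describes CLARA as scoring ambiguity from the semantic dispersion of LLM-generated clarifications alone, using an off-the-shelf similarity model and no answering step. The snippet below is a minimal sketch of that idea, not the paper's implementation: the `generate_clarifications` hook, the `all-MiniLM-L6-v2` embedder, and the choice of mean pairwise cosine distance as the dispersion statistic are all illustrative assumptions.

```python
import itertools
from typing import Callable, List

import numpy as np
from sentence_transformers import SentenceTransformer


def clara_ambiguity_score(
    query: str,
    generate_clarifications: Callable[[str, int], List[str]],
    num_clarifications: int = 8,
    embedder_name: str = "all-MiniLM-L6-v2",
) -> float:
    """Estimate input ambiguity as the semantic dispersion of LLM-generated
    clarifications, without answering any of them.

    `generate_clarifications(query, k)` is a hypothetical hook that prompts an
    LLM for k clarifying reformulations of `query`; any chat API can back it.
    """
    clarifications = generate_clarifications(query, num_clarifications)
    if len(clarifications) < 2:
        return 0.0  # too few clarifications to measure dispersion

    # Embed clarifications with an off-the-shelf similarity model.
    embedder = SentenceTransformer(embedder_name)
    embeddings = embedder.encode(clarifications, normalize_embeddings=True)

    # Dispersion = mean pairwise cosine distance between clarifications
    # (one plausible choice; the paper may use a different statistic).
    distances = [
        1.0 - float(np.dot(a, b))
        for a, b in itertools.combinations(embeddings, 2)
    ]
    return float(np.mean(distances))
```

Under this sketch, a well-specified query should yield near-duplicate clarifications (score near 0), while a vague query should yield semantically scattered clarifications (higher score), matching the paper's hypothesis that ambiguous inputs elicit more diverse clarifications.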