Abstract: Users often assume that large language models (LLMs) share their understanding of context and intent, leading them to omit critical information in question answering (QA) and to pose ambiguous queries. Responses based on misaligned assumptions may then be perceived as hallucinations, so identifying possible implicit assumptions is crucial in QA. We propose Conditional Ambiguous Question-Answering (CondAmbigQA), a benchmark comprising 200 ambiguous queries and condition-aware evaluation metrics. Our study pioneers the concept of "conditions" in ambiguous QA tasks, where conditions are the contextual constraints or assumptions that resolve a query's ambiguity. Using a retrieval-based annotation strategy, we identify the possible interpretations of each query from retrieved Wikipedia fragments, treat these interpretations as conditions, and annotate answers accordingly. Experiments show that models that consider conditions before answering improve answer accuracy by 19%, with an additional 5% gain when conditions are provided explicitly. These results underscore the value of conditional reasoning in QA and offer researchers tools for rigorous evaluation of ambiguity resolution.
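A minimal sketch of what a single CondAmbigQA-style entry might look like, based only on the abstract's description (an ambiguous query paired with retrieval-grounded conditions and per-condition answers); the field names, example query, and values are illustrative assumptions, not the released dataset schema:

# Hypothetical illustration of one entry; schema and values are assumed for exposition.
example_entry = {
    "query": "Who wrote the national anthem?",  # ambiguous: the country is unstated
    "conditions": [
        {
            # A condition is a contextual constraint that resolves the ambiguity,
            # grounded in a retrieved Wikipedia fragment.
            "condition": "The query refers to the national anthem of the United States.",
            "evidence": "Retrieved fragment about 'The Star-Spangled Banner'",
            "answer": "Francis Scott Key wrote the lyrics.",
        },
        {
            "condition": "The query refers to the national anthem of France.",
            "evidence": "Retrieved fragment about 'La Marseillaise'",
            "answer": "Claude Joseph Rouget de Lisle wrote it.",
        },
    ],
}

if __name__ == "__main__":
    # Condition-aware answering: state the assumed condition alongside each answer.
    for c in example_entry["conditions"]:
        print(f"Under condition: {c['condition']} -> {c['answer']}")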
Paper Type: Long
Research Area: Question Answering
Research Area Keywords: knowledge base QA, reasoning, benchmarking, open-domain QA, NLP datasets, reproducibility
Contribution Types: Data resources
Languages Studied: English
Submission Number: 1183