Clicking the Void: LLM Agent is Hallucinating and Where to Find Them

Clicking the Void: LLM Agent is Hallucinating and Where to Find Them

ACL ARR 2025 May Submission7557 Authors

20 May 2025 (modified: 03 Jul 2025)ACL ARR 2025 May SubmissionEveryoneRevisionsBibTeXCC BY 4.0

Abstract: Hallucinations pose critical risks in large language model (LLM)-based agents. When outputs are inconsistent with the contextual or environmental reality, they manifest incorrect or harmful actions. While recent study have exposed such failures, existing evaluations remain fragmented and lack a principled testbed. In this paper, we present the first unified benchmarking framework for eliciting and evaluating hallucinations in interactive LLM-agent scenarios. We begin by introducing a three-part taxonomy to address agentic hallucinations: actions that are unfaithful to (i) task instructions, (ii) execution history, or (iii) environment observations. To analyze, we first elicit such failures by performing a systematic audit of existing agent benchmarks, then synthesize test cases using a snapshot strategy that isolates decision points in deterministic and reproducible manners. To evaluate hallucination behaviors, we adopt the LLM-as-a-Judge paradigm with tailored risk-aware prompts, enabling scalable, high-fidelity assessment of agent actions without enumerating full action spaces. Our framework provides actionable insights on failure modes of LLM agents and lays the groundwork for principled progress in mitigating hallucinations in interactive environments.

Paper Type: Long

Research Area: Resources and Evaluation

Research Area Keywords: Hallucination, LLM agents, Benchmark

Languages Studied: English

Submission Number: 7557

Loading