Keywords: Large Language Model, LLM Agent, Large Reasoning Model, Tool Hallucination
TL;DR: Strengthening reasoning in LLM agents increases tool hallucination
Abstract: Enhancing the reasoning capabilities of Large Language Models (LLMs) is a key strategy for building agents that "think, then act". However, recent observations, such as those reported for OpenAI's o3, suggest a paradox: stronger reasoning often coincides with increased hallucination, yet **no prior work has systematically examined whether reasoning enhancement itself causes tool hallucination**. We address this gap with the central question: **Does strengthening reasoning increase tool hallucination?** To answer it, we introduce ***SimpleToolHalluBench***, a diagnostic benchmark measuring tool hallucination in two failure modes: (i) no tool available, and (ii) only distractor tools available. Through controlled experiments, we establish three key findings. First, we demonstrate a causal relationship: progressively enhancing reasoning through RL increases tool hallucination in proportion to task performance gains. Second, the effect is not mere overfitting: training on non-tool tasks (e.g., mathematics) still amplifies subsequent tool hallucination. Third, the effect is method-agnostic, appearing both when reasoning is instilled via supervised fine-tuning and when it is merely elicited at inference by switching from direct answers to step-by-step thinking. We also evaluate mitigation strategies, including prompt engineering and Direct Preference Optimization (DPO), revealing a fundamental **reliability–capability trade-off**: reducing hallucination consistently degrades utility. Mechanistically, reasoning RL disproportionately collapses tool-reliability–related representations, and hallucinations surface as amplified divergences concentrated in late-layer residual streams. These findings reveal that **current reasoning enhancement methods inherently amplify tool hallucination**, highlighting the need for new training objectives that jointly optimize for capability and reliability. Our implementation is provided at https://anonymous.4open.science/r/Reasoning_Trap-E5B6/.
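The benchmark's two failure modes can be pictured with a minimal sketch. The case format, function names, and example query below are illustrative assumptions for exposition, not the paper's actual implementation or data format.

```python
# Illustrative sketch (hypothetical names): building the two diagnostic
# settings described in the abstract -- (i) no tool available and
# (ii) only distractor tools available -- and flagging any tool call
# in either setting as a tool hallucination.
from dataclasses import dataclass, field


@dataclass
class EvalCase:
    """One diagnostic prompt plus the tool list exposed to the agent."""
    query: str
    tools: list = field(default_factory=list)  # empty list => "no tool" setting


def make_cases(query: str, distractor_tools: list) -> list:
    """Build both failure-mode variants for a single user query."""
    return [
        EvalCase(query=query, tools=[]),                # (i) no tool available
        EvalCase(query=query, tools=distractor_tools),  # (ii) distractors only
    ]


def is_tool_hallucination(case: EvalCase, called_tool: str | None) -> bool:
    """In both settings no exposed tool can solve the task, so any call counts."""
    return called_tool is not None


# Example usage with a hypothetical query and unrelated distractor tools.
cases = make_cases(
    "What is the weather in Paris right now?",
    distractor_tools=["sort_list", "translate_text"],
)
print(is_tool_hallucination(cases[1], called_tool="translate_text"))  # True
```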
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 11597