Unraveling Hallucination in Large Reasoning Models: A Topological Perspective

15 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Hallucination, Large Reasoning Model
Abstract: Large Reasoning Models (LRMs) have recently demonstrated strong capabilities in multi-step problem solving through extended chain-of-thought (long-CoT) and self-reflective reasoning. However, this very reliance on long reasoning chains makes them vulnerable to hallucinations, where early-stage errors become amplified and embedded within otherwise coherent logical traces. Existing hallucination detection methods largely focus on short-CoT models, leaving the unique challenges of LRMs underexplored. In this paper, we propose a topological perspective to \textit{analyze}, \textit{detect}, and \textit{mitigate} hallucinations in LRMs. \textbf{(I) Analyze}: We formalize reasoning trajectories as structured graphs and conduct statistical analysis on 6,000+ annotated reasoning graphs, revealing 17 topological features that reliably distinguish hallucinated from faithful reasoning. \textbf{(II) Detect}: Building on these insights, we develop G-Detector, a graph-based post-hoc hallucination detector that leverages only reasoning topology and achieves up to $88.9\%$ detection accuracy. \textbf{(III) Mitigate}: We extend G-Detector to mitigation by filtering high-risk reasoning traces during cold-start supervised fine-tuning in the LRM training process, which improves the LRM's factual accuracy by $13.8\%$ without impairing reasoning ability. Together, these results show that hallucinations in LRMs are not arbitrary but leave identifiable structural signatures in their reasoning topologies, opening a principled pathway toward reliable detection and prevention of LRM hallucinations.
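To make the topology-only detection idea concrete, below is a minimal illustrative sketch, not the paper's G-Detector: it assumes reasoning steps are represented as nodes of a directed graph with dependency edges, computes a handful of placeholder topological features with networkx (the paper's 17 features are not specified here), and fits a simple logistic-regression classifier on labeled reasoning graphs.

```python
# Illustrative sketch only: the feature set, graph construction, and classifier
# below are placeholder assumptions, not the G-Detector described in the paper.
import networkx as nx
import numpy as np
from sklearn.linear_model import LogisticRegression

def topological_features(g: nx.DiGraph) -> np.ndarray:
    """Compute a few simple topology-only features of a reasoning graph."""
    n = g.number_of_nodes()
    m = g.number_of_edges()
    undirected = g.to_undirected()
    # Longest chain of dependent steps (graph depth); fall back to n if cyclic.
    depth = nx.dag_longest_path_length(g) if nx.is_directed_acyclic_graph(g) else n
    return np.array([
        n,                                            # number of reasoning steps
        m,                                            # number of dependencies
        m / max(n, 1),                                # average branching factor
        depth,                                        # depth of the reasoning chain
        nx.number_connected_components(undirected),   # disconnected sub-arguments
        sum(1 for _ in nx.simple_cycles(g)),          # revision / self-reflection loops
    ], dtype=float)

def fit_detector(graphs, labels):
    """Train a post-hoc detector on labeled reasoning graphs.

    graphs: list[nx.DiGraph]; labels: 1 = hallucinated, 0 = faithful.
    """
    X = np.stack([topological_features(g) for g in graphs])
    return LogisticRegression(max_iter=1000).fit(X, labels)
```

In this sketch, the detector never inspects the text of the reasoning steps, only the structure of the trace, which mirrors the abstract's claim that topology alone carries a usable hallucination signal.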
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 6130