Keywords: Large Language Models, Hallucination Detection, Chain-of-Thought Reasoning
Abstract: Long chain-of-thought (CoT) reasoning improves the performance of large language models, yet hallucinations in such settings often emerge subtly and propagate across reasoning steps.
We suggest that hallucination in long CoT reasoning is better understood as an evolving latent state rather than a one-off erroneous event.
Accordingly, we treat step-level hallucination judgments as local observations and introduce a cumulative prefix-level hallucination signal that tracks the global evolution of the reasoning state over the entire trajectory.
Overall, our approach enables streaming hallucination detection in long CoT reasoning, providing real-time, interpretable evidence of emerging hallucination as the trajectory unfolds.
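As an illustration only (the abstract does not specify the paper's actual formulation), one way to turn step-level hallucination judgments into a cumulative prefix-level signal is to aggregate per-step hallucination probabilities over each prefix, e.g., with a noisy-OR style accumulation. The Python sketch below assumes hypothetical step-level scores and this aggregation choice; it is a minimal sketch, not the authors' method.

```python
# Illustrative sketch only: assumes step-level hallucination probabilities
# and a noisy-OR style prefix aggregation as one plausible realization of a
# cumulative prefix-level signal; not the paper's actual formulation.
from typing import Iterable, Iterator


def prefix_hallucination_signal(step_probs: Iterable[float]) -> Iterator[float]:
    """Stream a cumulative prefix-level score from step-level judgments.

    Each step-level probability p_t is treated as a local observation that
    step t is hallucinated; the prefix score after step t is the probability
    that at least one of steps 1..t is hallucinated (noisy-OR aggregation).
    """
    survival = 1.0  # probability that no step so far is hallucinated
    for p in step_probs:
        survival *= (1.0 - p)
        yield 1.0 - survival


if __name__ == "__main__":
    # Hypothetical step-level judgments for a 5-step reasoning trace.
    steps = [0.02, 0.05, 0.40, 0.10, 0.03]
    for t, score in enumerate(prefix_hallucination_signal(steps), start=1):
        print(f"after step {t}: prefix-level signal = {score:.3f}")
```

Because the prefix score is updated step by step, such a signal can be monitored while the model is still generating, which is the streaming setting the abstract describes.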
Paper Type: Long
Research Area: Safety and Alignment in LLMs
Research Area Keywords: Interpretability and Analysis of Models for NLP, Language Modeling
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 3791