Keywords: Hallucination detection, Large language models, Uncertainty quantification, Selective generation, Attention mechanisms
TL;DR: We introduce a new unsupervised method for hallucination detection in large language models that integrates attention weights and token probabilities.
Abstract: Recent progress in large language models (LLMs) has led to systems capable of producing text with remarkable fluency. However, these models are still prone to factual inaccuracies, often referred to as "hallucinations". One strategy to alleviate this issue is uncertainty quantification (UQ), but most existing approaches are computationally intensive or require supervision. In this work, we propose Recurrent Attention-based Uncertainty Quantification (RAUQ), an unsupervised and efficient framework for identifying hallucinations. The method leverages an observation about transformer attention behavior: when incorrect information is generated, certain "uncertainty-aware" attention heads tend to reduce their focus on preceding tokens. RAUQ automatically detects these attention heads and combines their activation patterns with token-level confidence measures in a recurrent scheme, producing a sequence-level uncertainty estimate in just a single forward pass. Through experiments on twelve tasks spanning question answering, summarization, and translation across four different LLMs, we show that RAUQ consistently outperforms state-of-the-art UQ baselines. Importantly, it does so at minimal cost: less than 1% additional computation. Since it requires neither labeled data nor extensive parameter tuning, RAUQ serves as a lightweight, plug-and-play solution for real-time hallucination detection in white-box LLMs.
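To make the idea in the abstract concrete, below is a minimal sketch of how attention-to-previous-token signals from a selected head might be mixed recurrently with token log-probabilities into a single sequence-level uncertainty score. This is an illustration only, not the paper's exact formulation: the recurrence form, the mixing weight `alpha`, and the assumption that a single "uncertainty-aware" head has already been identified are all hypothetical choices made for this example.

```python
import math

def rauq_like_uncertainty(token_logprobs, prev_token_attn, alpha=0.5):
    """Illustrative recurrent mix of per-token confidence with the selected
    head's attention to the previous token (hypothetical formulation).

    token_logprobs:  log p(y_t | y_<t, x) for each generated token
    prev_token_attn: the chosen head's attention weight from token t to t-1
    Returns a sequence-level uncertainty (higher = more likely hallucination).
    """
    conf_prev = None
    confidences = []
    for logp, attn in zip(token_logprobs, prev_token_attn):
        if conf_prev is None:
            conf = logp  # first token: no recurrent term yet
        else:
            # Low attention to the previous token discounts the carried-over
            # confidence, so attention drops push the uncertainty up.
            conf = alpha * logp + (1 - alpha) * attn * conf_prev
        confidences.append(conf)
        conf_prev = conf
    # Negate the mean confidence so larger values mean more uncertain.
    return -sum(confidences) / len(confidences)


# Toy usage with three generated tokens from a single forward pass:
# a low-probability token with weak attention to its predecessor
# (the middle one) raises the overall uncertainty score.
print(rauq_like_uncertainty(
    token_logprobs=[-0.1, -2.3, -0.4],
    prev_token_attn=[0.0, 0.15, 0.6],
))
```

Both inputs are available from the same forward pass used for generation, which is consistent with the abstract's claim of negligible overhead; how the uncertainty-aware head itself is selected is described in the paper, not reproduced here.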
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 24110