HalluFormer: A Transformer-Based Framework for Detecting Hallucination in Large Language Models

Bhanu Prakash Vangala

Published: 27 May 2025, Last Modified: 27 Mar 2026 | AAAI 2025 Spring Series -- AI for Scientific Discovery | Everyone | CC BY 4.0

Abstract: Despite their impressive performance on a variety of natural language processing tasks, Large Language Models (LLMs) remain prone to producing factually inaccurate information, a phenomenon known as hallucination. Its negative effect is even more pronounced in scientific and clinical domains, which demand highly precise answers. To address this problem, we formulate hallucination detection as a classification task that assesses the consistency between questions, answers, and retrieved knowledge contexts, and we propose HalluFormer, a transformer-based model for detecting LLM hallucinations. HalluFormer was trained and tested on the MultiNLI dataset, where it achieves an F1 score of 0.9471 on the test set. On the blind ANAH test dataset, it achieves an F1 score of 0.7285, indicating that it generalizes reasonably well to completely new data. These results demonstrate that transformer-based methods can be used to detect LLM hallucinations, paving the way for further research on improving the reliability of LLMs.
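
The classification framing described in the abstract can be illustrated with a minimal sketch. It is not the HalluFormer implementation: the `Example` class, the `build_input` helper, the `[SEP]` separator, and the label convention (1 = hallucinated) are all assumptions made for illustration, and the abstract does not say which F1 variant (binary, macro, or weighted) is reported. The sketch only shows how a question, an answer, and a retrieved context might be packed into one classifier input, and how a binary F1 score is computed from predictions.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """One (question, answer, context) triple with a binary label.

    Label convention is an assumption: 1 = hallucinated, 0 = consistent.
    """
    question: str
    answer: str
    context: str
    label: int


def build_input(ex: Example, sep: str = " [SEP] ") -> str:
    # Hypothetical packing scheme: concatenate the three fields into a
    # single sequence that a transformer classifier could consume.
    return sep.join([ex.question, ex.answer, ex.context])


def f1_score(y_true: List[int], y_pred: List[int], positive: int = 1) -> float:
    # Binary F1 for the chosen positive class: harmonic mean of
    # precision and recall.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    ex = Example(
        question="What is the boiling point of water at sea level?",
        answer="Water boils at 100 degrees Celsius at sea level.",
        context="At standard atmospheric pressure, water boils at 100 degrees Celsius.",
        label=0,
    )
    print(build_input(ex))
    print(f1_score([1, 1, 0, 0], [1, 0, 0, 1]))
```

In this sketch a trained classifier would map the packed string to a label; the toy call to `f1_score` uses hand-written predictions only to show how the metric is computed.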