LiteHall: A Three-Stage, Modular and Lightweight Pipeline for End-to-End Hallucination Detection

18 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Hallucination Detection, Large Language Models (LLMs), Small Language Models (SLMs), RLVR, LiteHall, HaFin500, Modularity
Abstract: Large Language Models (LLMs) are increasingly applied in high-stakes domains such as medicine and law, where hallucinations can have serious consequences. Existing detection approaches either depend on costly proprietary LLMs with limited adaptability, or on monolithic open-source models that require full retraining, struggle with long evidence contexts, and lack transparency. We introduce LiteHall, a lightweight, fully open-source, three-stage hallucination detection pipeline designed for modularity, domain adaptability, and interpretability. Each stage leverages a 1.7B-parameter Small Language Model (SLM) trained independently with stage-specific Reinforcement Learning with Verifiable Rewards (RLVR) over a high-quality synthetic corpus of 120K+ examples, enabling efficient specialization without reliance on large monolithic models. To advance rigorous evaluation, we present HaFin500, a fine-grained benchmark of 500 long-form QA pairs spanning 30 fact-seeking domains, annotated with 6K claims and 3.5M evidence tokens. Extensive experiments show that LiteHall consistently surpasses both open-source and proprietary detectors. On out-of-domain benchmarks, LiteHall achieves substantial gains over strong baselines, including +6.4% / +10.0% (Accuracy/F1) against MiniCheck-7B, +6.1% / +4.8% over SAFE (GPT-3.5-turbo), +11.5% / +13.0% over AlignScore, and +9.8% / +15.2% over FAVA. Even compared to GPT-4o, LiteHall delivers +4.7% / +3.0% improvements in zero-shot mode, while retaining an additional +2.0% / +0.9% advantage when GPT-4o is integrated as a backbone. These results demonstrate that LiteHall not only matches or exceeds strong baselines in-domain but also generalizes robustly out-of-domain, establishing it as a practical, transparent, and reproducible solution for trustworthy LLM deployments.
Supplementary Material: zip
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 13882