Keywords: Span-Level Hallucination Detection, Automated Dataset Construction, Explainability
Abstract: Effective span-level hallucination detection is crucial for the trustworthy deployment of Large Language Models (LLMs) and relies on high-quality datasets as its foundational prerequisite. To alleviate the high cost of manual span-level annotation, prior work has turned to automatic dataset construction. Existing automatically constructed span-level datasets are typically created by synthesizing hallucinated claims via span perturbations paired with label annotations. However, such synthetic hallucinated claims often exhibit surface-level flaws such as disfluency, incoherence, or inconsistency, which distinguish them from the fluent, coherent, and consistent yet factually incorrect hallucinations produced by real-world LLMs; meanwhile, label-only annotations lack explainability and practical utility. To tackle these problems, we propose a novel self-refinement framework that generates more realistic hallucinated claims and explainable annotations by leveraging structured hallucination-inducing edit suggestions and a generate–critique–refine loop, resulting in a high-quality span-level dataset with explainable annotations, named ExpHul. Moreover, we train a compact hallucination detection model on ExpHul. Extensive experiments show that this detection model outperforms state-of-the-art baselines while providing high-quality explanations and revisions. Furthermore, the 8B-parameter detection model achieves performance competitive with leading closed-source LLMs such as Gemini-3.0-Pro and GPT-5.2 at a lower inference cost.
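A minimal sketch of how the generate–critique–refine loop described in the abstract might operate, assuming a text-in/text-out LLM callable; the function names (llm, propose_edit, apply_edit), prompts, output schema, and stopping criterion are all illustrative assumptions, not the paper's actual implementation.

```python
import json
from typing import Callable, Dict

def propose_edit(llm: Callable[[str], str], claim: str) -> Dict[str, str]:
    """Ask the model for a structured hallucination-inducing edit.
    Assumed to return {"span": ..., "replacement": ..., "rationale": ...};
    real code would validate the model's JSON output."""
    prompt = ("Suggest one factual edit for the claim below as JSON with "
              "keys 'span', 'replacement', 'rationale'.\nClaim: " + claim)
    return json.loads(llm(prompt))

def apply_edit(claim: str, edit: Dict[str, str]) -> str:
    """Inject the factual error by replacing the targeted span."""
    return claim.replace(edit["span"], edit["replacement"], 1)

def generate_critique_refine(llm: Callable[[str], str], claim: str,
                             max_rounds: int = 3) -> Dict[str, str]:
    """Generate-critique-refine loop: synthesize a realistic hallucinated
    claim together with an explainable span-level annotation."""
    edit = propose_edit(llm, claim)      # generate: structured edit suggestion
    candidate = apply_edit(claim, edit)
    for _ in range(max_rounds):
        # critique: flag surface-level flaws (disfluency, incoherence,
        # inconsistency) that would betray the claim as synthetic
        critique = llm("List fluency/coherence/consistency flaws, or reply "
                       "'NONE':\n" + candidate)
        if critique.strip().upper() == "NONE":
            break
        # refine: remove the flagged flaws while keeping the injected error
        candidate = llm("Rewrite to fix these flaws but keep the factual "
                        "error intact.\nFlaws: " + critique +
                        "\nClaim: " + candidate)
    return {"hallucinated_claim": candidate,
            "hallucinated_span": edit["replacement"],
            "explanation": edit["rationale"]}
```

Under these assumptions, each accepted output pairs the refined claim with the hallucinated span and its rationale, yielding the kind of explainable span-level annotation the abstract attributes to ExpHul.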
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: fact checking, rumor/misinformation detection
Contribution Types: NLP engineering experiment, Data resources
Languages Studied: English
Submission Number: 9705