Continual Pre-Training for Hallucination Reduction

ICLR 2026 Conference Submission23025 Authors

20 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: GRPO, MRC, reading comprehension
Abstract: Hallucinations, where model outputs contradict or cannot be verified by the provided evidence, remain a central obstacle to the reliable use of large language models (LLMs), and it remains an open question how they can be reduced through better training methods. Prior work finds that the mismatch between the pre-training and fine-tuning datasets is a main cause of hallucinations. To reduce this effect, we introduce Continual Pre-Training for Hallucinations (CPTHalu), a method that fine-tunes on a sample in parallel with continued pre-training on its corresponding factual knowledge. We adapt GRPO to reading comprehension via our training scheme, which is, to the best of our knowledge, the first effort to apply RL-based knowledge fine-tuning to reading comprehension. Our experiments on HaluEval and SQuAD show large and consistent performance gains of up to 17 points. To further assess factual grounding, we also perform an ablation study on our new Augmented QA benchmarks, which consist of novel question-answer pairs over the same source documents, and obtain improvements in both closed-book and open-book performance. We also validate scalability on smaller models, showing that CPTHalu's benefits persist under limited capacity. Our results establish CPTHalu as a simple yet effective strategy for mitigating hallucinations in LLMs. Our code and dataset will be released upon publication.
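The core idea described in the abstract is to fine-tune on a QA sample while, in parallel, continuing pre-training on the passage that grounds it. Below is a minimal sketch of one way such a combined objective could be implemented; the model name, the weighting term `alpha`, the prompt format, and the data fields are illustrative assumptions (the paper's code is not yet released), and the GRPO adaptation mentioned in the abstract is not shown.

```python
# Minimal sketch (not the authors' implementation): combine supervised
# fine-tuning on a QA sample with a continued pre-training loss on the
# passage that grounds the answer, as the abstract describes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("gpt2")   # placeholder model
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
alpha = 0.5  # assumed weight balancing the two objectives

def lm_loss(text: str) -> torch.Tensor:
    """Standard next-token (causal LM) loss on a text string."""
    batch = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    out = model(**batch, labels=batch["input_ids"])
    return out.loss

def train_step(sample: dict) -> float:
    # sample is assumed to look like {"question": ..., "answer": ..., "passage": ...}
    # Fine-tuning term: teach the model to answer the question.
    ft = lm_loss(f"Question: {sample['question']}\nAnswer: {sample['answer']}")
    # Continued pre-training term: refresh the factual knowledge
    # contained in the source passage the answer is grounded in.
    cpt = lm_loss(sample["passage"])
    loss = ft + alpha * cpt
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```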
Primary Area: generative models
Submission Number: 23025