Neural Diversity Regularizes Hallucinations in Language Models

20 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: hallucination suppression
TL;DR: Neural diversity—decorrelated parallel streams—provably and empirically reduces hallucinations in LLMs at fixed parameter and data budgets.
Abstract: Language models continue to hallucinate despite increases in parameters, compute, and data. We propose *neural diversity* — decorrelated parallel representations — as a principled mechanism that reduces hallucination rates at fixed parameter and data budgets. While existing mitigation strategies largely target accuracy, we provide the first formal tail bounds on hallucination probability in ensembled language models, reframing hallucination as a second-moment reliability problem and *explaining 96.2% of the empirical reliability variation* observed across parallel configurations. We introduce ND-LoRA (Neural Diversity Low-Rank Adaptation), which combines parallel LoRA adapters with Barlow Twins regularization, and *reduces hallucinations by up to 25.6% (14.6% on average)* while preserving general accuracy. Ablations show that the LoRA adapters and the regularization act synergistically; causal interventions identify neural diversity as the mediating factor; and correlational studies quantify its effect size: a 0.1% increase in neural correlation is associated with a 3.8% increase in hallucination rate. Finally, task-dependent optimality emerges: different tasks require different optimal amounts of neural diversity. Together, our results highlight neural diversity as a third axis of scaling — orthogonal to parameters and data — for *improving the reliability of language models at fixed budgets*.
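The abstract describes a Barlow Twins-style regularizer that decorrelates parallel representation streams. The exact loss used in ND-LoRA is not given here, so the following NumPy sketch is an assumption-laden illustration: it computes the cross-correlation matrix between the batch activations of two parallel streams and penalizes its entries toward zero, encouraging the streams to carry decorrelated features. The function names `cross_correlation` and `diversity_penalty`, and the uniform weighting of all matrix entries, are hypothetical choices, not the paper's specification.

```python
import numpy as np

def cross_correlation(za, zb):
    # Standardize each feature over the batch, then form the D x D
    # cross-correlation matrix between the two streams' activations.
    za = (za - za.mean(axis=0)) / (za.std(axis=0) + 1e-8)
    zb = (zb - zb.mean(axis=0)) / (zb.std(axis=0) + 1e-8)
    return (za.T @ zb) / za.shape[0]

def diversity_penalty(za, zb, lam=1.0):
    # Barlow Twins-style penalty, adapted for neural diversity:
    # every entry of the cross-correlation matrix is pushed toward
    # zero, so redundant (correlated) streams incur a high loss.
    # (Assumed form; the paper's regularizer may weight diagonal
    # and off-diagonal terms differently.)
    c = cross_correlation(za, zb)
    return lam * np.mean(c ** 2)

# Two identical streams are maximally correlated and are penalized
# far more heavily than two independent random streams.
rng = np.random.default_rng(0)
za = rng.standard_normal((256, 8))
zb = rng.standard_normal((256, 8))
print(diversity_penalty(za, za), diversity_penalty(za, zb))
```

In a training loop, this penalty would be added to the language-modeling loss, with `lam` trading off decorrelation against task accuracy; the abstract's finding of task-dependent optimal diversity suggests this weight would be tuned per task.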
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 22209