Abstract: Language models continue to hallucinate despite increases in parameters, compute, and data. We propose neural diversity (decorrelated parallel representations) as a provable mechanism to reduce hallucination rates in our setting at fixed parameter and data budgets. While existing mitigation strategies largely target accuracy, we provide the first formal tail bounds for hallucination probability in ensembled language models, reframing hallucination as a second-moment reliability problem governed by representational covariance and explaining 94.3% of the empirical reliability variation observed across parallel configurations in our experimental settings. We introduce ND-LoRA (Neural Diversity Low-Rank Adaptation), which combines parallel LoRA adapters with Barlow Twins regularization and reduces hallucinations by up to 25.6% (14.6% on average) on the evaluated benchmarks while preserving general accuracy. Ablations show that the LoRA adapters and the regularization act synergistically, causal interventions establish neural diversity as the mediating factor, and correlational studies indicate the scale of the effect: a 0.1% increase in neural correlation is associated with a 3.8% increase in hallucination. Finally, task-dependent optimality emerges: different tasks require different amounts of neural diversity. Neural diversity thus enables reliability gains without scaling compute, improving tail behavior orthogonally to parameters and data at near-zero additional cost.
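As a rough illustration of what "decorrelated parallel representations" could mean in practice, the sketch below computes a Barlow Twins-style cross-correlation matrix between the features of two parallel streams and penalizes its squared entries. This is a simplified stand-in, not the ND-LoRA objective: the abstract does not give the exact loss, and the original Barlow Twins objective additionally drives the diagonal toward 1. The function names (`standardize`, `cross_correlation`, `decorrelation_penalty`) and the choice to average over all matrix entries are assumptions made for illustration only.

```python
import math


def standardize(column):
    """Return a zero-mean, unit-variance copy of one feature column."""
    n = len(column)
    mean = sum(column) / n
    var = sum((x - mean) ** 2 for x in column) / n
    std = math.sqrt(var) or 1.0  # constant feature: avoid division by zero
    return [(x - mean) / std for x in column]


def cross_correlation(a, b):
    """Cross-correlation matrix between two batches of shape (n, d).

    Entry [i][j] is the correlation between feature i of stream `a`
    and feature j of stream `b`, estimated over the batch dimension.
    """
    n = len(a)
    a_cols = [standardize(col) for col in zip(*a)]
    b_cols = [standardize(col) for col in zip(*b)]
    return [
        [sum(x * y for x, y in zip(ac, bc)) / n for bc in b_cols]
        for ac in a_cols
    ]


def decorrelation_penalty(a, b):
    """Mean squared cross-correlation between two streams (assumed form).

    0.0 means the streams are linearly uncorrelated on this batch;
    larger values mean their representations are more redundant.
    """
    c = cross_correlation(a, b)
    entries = [v * v for row in c for v in row]
    return sum(entries) / len(entries)
```

Under these assumptions, a training loop would add a weighted `decorrelation_penalty` for each pair of parallel adapters to the task loss, so that the streams are pushed toward lower correlation while the base model stays fixed.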
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Akanksha_Saran1
Submission Number: 7960