Keywords: Query-efficient learning, Hallucination mitigation, Projected stochastic gradient descent (PSGD), Active learning / membership queries
Abstract: Hallucinations—where generative models produce invalid or nonsensical outputs—remain a critical challenge for reliable deployment. We present the first computationally and query-efficient algorithm that provably addresses the hallucination problem by actively querying the model’s own invalid outputs. Specifically, we impose a strict constraint on the hallucination rate while maximizing the likelihood of valid target examples via projected stochastic gradient descent. Our method works in very general settings with arbitrary distributions parameterized by sufficiently expressive exponential families. Our approach is enabled by a novel connection to the field of truncated statistics and settles an open problem posed by Hanneke et al. (2018).
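The abstract describes a constrained maximum-likelihood procedure: gradient updates on the likelihood of valid examples, interleaved with a projection that keeps the model's hallucination rate below a fixed level, using membership queries on the model's own outputs. The Python sketch below only illustrates that high-level idea under assumptions introduced here; `feature_fn`, `sample_fn`, `is_valid_oracle`, and the damping-based feasibility step are hypothetical stand-ins, since the abstract does not specify the paper's actual projection operator.

```python
import numpy as np

def psgd_low_hallucination(theta0, valid_data, feature_fn, sample_fn,
                           is_valid_oracle, eps=0.05, lr=0.1, steps=200,
                           n_model_samples=256, seed=0):
    """Illustrative constrained-likelihood loop (not the paper's algorithm).

    theta0          : initial natural parameters of an exponential-family model
    valid_data      : array of valid target examples
    feature_fn      : sufficient statistic T(x), applied row-wise to an array (assumed)
    sample_fn       : sample_fn(theta, n, rng) -> n samples from p_theta (assumed)
    is_valid_oracle : membership query; True iff a sampled output is valid (assumed)
    eps             : maximum allowed hallucination rate
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float)
    t_data = feature_fn(valid_data).mean(axis=0)   # empirical sufficient statistics

    for _ in range(steps):
        # Stochastic log-likelihood gradient for an exponential family:
        # E_data[T(x)] - E_theta[T(x)], with the model term estimated from fresh samples.
        xs = sample_fn(theta, n_model_samples, rng)
        theta = theta + lr * (t_data - feature_fn(xs).mean(axis=0))

        # Feasibility step: query validity of the model's own samples and, if too
        # many are invalid, damp theta.  This halving heuristic is a placeholder
        # for the paper's projection onto the low-hallucination parameter set.
        for _ in range(20):
            xs = sample_fn(theta, n_model_samples, rng)
            invalid_rate = 1.0 - np.mean([is_valid_oracle(x) for x in xs])
            if invalid_rate <= eps:
                break
            theta = 0.5 * theta
    return theta
```

In the setting sketched above, the projection would ideally map the parameters onto the set whose invalid-output mass is at most eps; the halving step is only a crude, sample-based substitute for that operation.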
Primary Area: learning theory
Submission Number: 19980