Track: long paper (up to 10 pages)
Keywords: hallucination, logical constraints, support set, hard masking, formal guarantees
TL;DR: Softmax-based mitigation can only shrink errors, not eliminate invalid outputs; we propose a Values→Support continuum and three layers of logical intervention culminating in structural hard masking for formal guarantees.
Abstract: Hallucination mitigation for large language models (LLMs) is reaching diminishing returns: prompt engineering, retrieval-augmented generation (RAG), and reinforcement learning from human feedback (RLHF) can drive error probabilities extremely low, but they cannot break through a structural non-zero lower bound. We trace this limit to an epistemological tension between probabilistic likelihood (used here in the broad sense of “probabilistic plausibility,” not the statistical likelihood function) and logical necessity. Standard Softmax with finite logits and no hard masking assigns $P(t)>0$ to every token $t$, and this conflicts irreconcilably with logical validity, which requires that certain outputs be excluded exactly. To address this, we propose the Values→Support continuum as an analytic framework. Along the information path (changing probability values), a mathematical ceiling on achievable reliability is unavoidable; along the structural path (hard masking to reconstruct the feasible set $V_{\text{valid}}$), the probability of logically invalid tokens can be made exactly zero. Building on this, we construct three layers of logical intervention, namely information (empiricism), normative (rationalism), and structural (formalism), which together form a defense in depth from statistical enhancement to formal guarantees. Rather than providing a final engineering solution, this paper sets out a research agenda for shifting AI reliability from soft alignment toward hard guarantees, and with it the way hallucinations are governed.
Presenter: ~Kun_Yuan7
Format: No, the presenting author is unable to, or unlikely to be able to, attend in person.
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Funding: No, the presenting author of this submission does *not* fall under ICLR’s funding aims, or has sufficient alternate funding.
Submission Number: 77
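
The abstract's contrast between the information path and the structural path can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the toy logits, and the `valid` mask standing in for $V_{\text{valid}}$ are all assumptions made for illustration. With finite logits, every token keeps a strictly positive probability; setting the logits of invalid tokens to negative infinity before normalization removes them from the support, so their probability is exactly zero.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def masked_softmax(logits, valid):
    """Softmax restricted to the tokens flagged True in `valid`.

    `valid` is a hypothetical stand-in for membership in V_valid.
    Invalid tokens get a logit of -inf, so exp(-inf) == 0.0 and they
    receive exactly zero probability after normalization.
    """
    if not any(valid):
        raise ValueError("at least one token must be valid")
    masked = [x if ok else -math.inf for x, ok in zip(logits, valid)]
    return softmax(masked)


# Toy example (assumed values): token 1 is logically invalid.
logits = [5.0, 2.0, -10.0]
valid = [True, False, True]

soft = softmax(logits)                # every entry is > 0
hard = masked_softmax(logits, valid)  # hard[1] is exactly 0.0
```

Lowering the logit of token 1 by any finite amount would shrink `soft[1]` but never make it zero in exact arithmetic, which is the ceiling the abstract attributes to the information path; only the mask changes the support. (In floating point, extremely negative finite logits can underflow to zero, which is a numerical artifact rather than a guarantee.)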