Hallucinations vs. Predictions: Reframing Uncertainty in LLM-Generated Medical Responses

Published: 07 May 2025, Last Modified: 07 May 2025 · MLGenX 2025 Tiny Papers · CC BY 4.0
Track: Tiny paper track (up to 5 pages)
Abstract: Large Language Models (LLMs) are increasingly used in medicine, but the traditional factual/hallucinatory distinction fails to reflect the evolving nature of medical knowledge. This paper critiques that binary and proposes a refined, three-tiered classification: (1) Currently Verifiable Responses, (2) Tentatively Examinable Responses, and (3) Predictive Responses. This framework introduces a veridicality gradient and emphasizes temporal verifiability, enabling more accurate evaluation, reducing clinical risk, and supporting adaptive model calibration. Ultimately, it promotes the development of safer and more epistemically responsible medical AI systems.
Submission Number: 83
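
To make the proposed taxonomy concrete, here is a minimal, illustrative Python sketch of how the three response tiers and a veridicality score might be represented downstream. The class names, the 0–1 `veridicality` field, and the example values are assumptions for illustration; the paper defines the conceptual framework, not an API.

```python
from dataclasses import dataclass
from enum import Enum


class ResponseTier(Enum):
    """Three-tier classification from the abstract (names paraphrased)."""
    CURRENTLY_VERIFIABLE = 1     # checkable against present-day medical evidence
    TENTATIVELY_EXAMINABLE = 2   # partially supported; evidence still emerging
    PREDICTIVE = 3               # forward-looking; not yet verifiable


@dataclass
class ClassifiedResponse:
    """An LLM-generated medical response with its tier and a veridicality score.

    The `veridicality` field (0.0-1.0) is a hypothetical stand-in for the
    paper's 'veridicality gradient'; the paper does not specify a scale.
    """
    text: str
    tier: ResponseTier
    veridicality: float


# Illustrative usage only:
resp = ClassifiedResponse(
    text="Drug X may reduce relapse rates in condition Y within five years.",
    tier=ResponseTier.PREDICTIVE,
    veridicality=0.3,
)
print(resp.tier.name, resp.veridicality)
```

Separating the tier label from a graded veridicality score mirrors the abstract's distinction between temporal verifiability (which tier a claim falls in) and the degree of current support for it.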