EVIDENCE-GATED SCIENTIFIC QA WITH EXPLICIT ABSTENTION AND PAGE-LEVEL PROVENANCE

Published: 03 Mar 2026 · Last Modified: 26 Apr 2026 · ICLR 2026 Workshop FM4Science Poster · CC BY 4.0
Keywords: Scientific Question Answering, Evidence-Gated Generation, Retrieval-Augmented Generation, Abstention, Verifiable Provenance, AI for Science, Selective Prediction, Document AI
TL;DR: A scientific QA system that treats answering as a decision problem, generating responses only when supported by verifiable evidence and abstaining otherwise.
Abstract: Large Language Models have demonstrated strong performance on scientific question-answering tasks, particularly when combined with retrieval-based mechanisms. However, in high-risk scientific domains, reliability depends not only on access to external knowledge but also on the system’s ability to determine when answering a question is epistemically justified. Existing Retrieval-Augmented Generation pipelines primarily address knowledge access and lack explicit decision policies governing when generation should be authorized or withheld under insufficient evidence. In this work, we introduce Pororoca, an Evidence-Gated Scientific QA system that treats question answering as a system-level decision problem. Pororoca conditions generation on the explicit sufficiency of verifiable scientific evidence and enforces abstention otherwise, producing answers only when they can be accompanied by auditable provenance at the document and page level. The system operates on a scientific corpus automatically structured by a large-scale Document AI pipeline and implements a deterministic, threshold-based decision policy separating conditional generation from explicit abstention. We describe the system architecture, decision logic, and an epistemically auditable evaluation protocol designed to assess evidence-based factuality, citation quality, and selective risk under realistic retrieval noise. By framing scientific QA reliability as a property of an explicit decision policy rather than of model behavior alone, this work contributes a principled system-level approach to verifiable and reliable scientific question answering.
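The abstract describes a deterministic, threshold-based gate that either authorizes generation with page-level provenance or abstains. The paper does not give the policy's exact form, so the following is a minimal illustrative sketch under assumed interfaces: the `Evidence` fields, the threshold `tau`, and the minimum-support count `min_support` are all hypothetical names, not the authors' implementation.

```python
from dataclasses import dataclass

@dataclass
class Evidence:
    """A retrieved passage with its provenance and a sufficiency score in [0, 1]."""
    doc_id: str
    page: int
    score: float

def decide(evidence: list[Evidence], tau: float = 0.7, min_support: int = 2) -> dict:
    """Deterministic evidence gate (illustrative): authorize generation only
    when enough passages clear the sufficiency threshold; otherwise abstain.
    Every authorized answer carries document- and page-level provenance."""
    support = [e for e in evidence if e.score >= tau]
    if len(support) < min_support:
        return {"action": "abstain", "provenance": []}
    return {
        "action": "generate",
        "provenance": [(e.doc_id, e.page) for e in support],
    }

# Example: one strong passage is not enough support under this policy.
weak = [Evidence("paper_12", 4, 0.91), Evidence("paper_31", 2, 0.40)]
strong = weak + [Evidence("paper_12", 5, 0.85)]
```

Such a gate is what makes selective risk measurable: coverage is the fraction of questions on which the policy returns `"generate"`, and risk is the error rate conditioned on that subset.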
Submission Number: 97