Keywords: Scientific Question Answering, Evidence-Gated Generation, Retrieval-Augmented Generation, Abstention, Verifiable Provenance, AI for Science, Selective Prediction, Document AI
TL;DR: A scientific QA system that treats answering as a decision problem, generating responses only when supported by verifiable evidence and abstaining otherwise.
Abstract: Large Language Models have demonstrated strong performance in scientific
question-answering tasks, particularly when combined with retrieval-based
mechanisms. However, in high-risk scientific domains, reliability depends not
only on access to external knowledge but also on the system's ability to
determine when answering a question is epistemically justified. Existing
Retrieval-Augmented Generation pipelines primarily address knowledge access
but lack explicit decision policies governing when generation should be
authorized and when it should be withheld due to insufficient evidence. In this work, we introduce
Pororoca, an Evidence-Gated Scientific QA system that treats question answering
as a system-level decision problem. Pororoca conditions generation on the
explicit sufficiency of verifiable scientific evidence and enforces abstention
otherwise; every answer it produces is accompanied by auditable provenance at
the document and page level. The system operates on a scientific corpus
automatically structured by a large-scale Document AI pipeline and implements a
deterministic, threshold-based decision policy separating conditional
generation from explicit abstention. We describe the system architecture,
decision logic, and an epistemically auditable evaluation protocol designed to
assess evidence-based factuality, citation quality, and selective risk under
realistic retrieval noise. By framing scientific QA reliability as a property of
explicit decision policy rather than model behavior alone, this work contributes
a principled system-level approach to verifiable and reliable scientific
question answering.
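A deterministic, threshold-based gate of the kind the abstract describes could be sketched as follows. This is a minimal illustration, not the paper's actual implementation: the `threshold` and `min_support` parameters, the `Passage` fields, and all names are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    text: str      # retrieved passage text
    doc_id: str    # source document identifier (for provenance)
    page: int      # page number (for provenance)
    score: float   # retrieval/relevance score in [0, 1]

def evidence_gate(passages, threshold=0.75, min_support=2):
    """Deterministic decision policy: authorize generation only if
    enough passages clear the evidence threshold; abstain otherwise.
    Both parameters are illustrative, not values from the paper."""
    support = [p for p in passages if p.score >= threshold]
    if len(support) < min_support:
        return None  # abstain: generation is not authorized
    # Document- and page-level provenance for every supporting passage.
    return [(p.doc_id, p.page) for p in support]

# Usage: the gate either returns citable provenance or None (abstention).
hits = [Passage("…", "paper_12", 4, 0.91),
        Passage("…", "paper_07", 2, 0.80)]
print(evidence_gate(hits))       # sufficient support: provenance list
print(evidence_gate(hits[:1]))   # insufficient support: None
```

Because the gate is a pure threshold test, the generate/abstain decision is auditable and reproducible independently of the language model's behavior.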
Submission Number: 97