RSI for Science: A Verifier-First Framework for AI Scientists

Published: 30 May 2026, Last Modified: 30 May 2026, Venue: ICML2026-AI4Science Poster, License: CC BY 4.0
Track: Track 1: Original Research/Position/Education/Attention Track
Keywords: recursive self-improvement, AI scientists, Verifier-Governed Recursive Scientific Refinement, VGRSR, empirical science, external disconfirmation gates, path-sensitive stability, verifier-efficiency accounting, proxy drift, latency starvation, multi-fidelity optimization, robust control, maximum drawdown, drawdown CVaR, scientific recursion, cost-accounted cycles, reward hacking, automated discovery
TL;DR: AI scientists described as recursive self-improvers face proxy drift in empirical science. We propose Verifier-Governed Recursive Scientific Refinement (VGRSR), which requires external gates, cost-accounted cycles, and path-sensitive risk metrics.
Abstract: AI scientists are increasingly framed as recursive self-improvers: systems that generate hypotheses, choose experiments, revise tools, store lessons, and improve future campaigns. We argue that this language is misleading for empirical science unless recursion is externally verifier-governed. In empirical domains, the verifier is not a compiler, theorem checker, or game rule engine; it is a noisy, delayed, costly, and sometimes destructive physical, statistical, or procedural test. We propose Verifier-Governed Recursive Scientific Refinement (VGRSR): a standard that credits scientific recursion only when reusable objects of the scientific search process—hypotheses, tools, verifiers, memory, and campaign policy—change through cost-accounted cycles that pass independent gates. VGRSR adds four requirements to AI-scientist evaluation: external disconfirmation gates, per-cycle provenance, path-sensitive stability metrics, and verifier-efficiency accounting. We support the position with mechanism demonstrations of proxy drift and latency starvation, a five-object taxonomy, a control/risk framing using sensitivity, maximum drawdown, and drawdown CVaR, and vignettes spanning materials, cosmology, nanobiomaterials, and neurodegeneration. The practical standard is simple: no external gate, no cost-accounted cycle log, no stability report, no scientific RSI claim.
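As an illustration of the path-sensitive stability metrics named in the abstract, the following is a minimal sketch of maximum drawdown and drawdown CVaR computed over a sequence of per-cycle externally verified scores. The abstract does not give formulas, so the definitions here are assumptions borrowed from common risk-analysis usage: drawdown is the gap between the best score so far and the current score, and drawdown CVaR is the mean of the worst (1 - alpha) fraction of drawdowns. The function names, the `alpha` default, and the example scores are hypothetical and are not taken from the paper.

```python
import math


def drawdowns(scores):
    """Per-cycle drawdown: best verified score so far minus the current score."""
    if not scores:
        raise ValueError("scores must contain at least one cycle")
    peak = -math.inf
    result = []
    for score in scores:
        peak = max(peak, score)
        result.append(peak - score)
    return result


def max_drawdown(scores):
    """Largest drop from a running peak over the whole campaign."""
    return max(drawdowns(scores))


def drawdown_cvar(scores, alpha=0.8):
    """Mean of the worst (1 - alpha) fraction of drawdowns (assumed definition)."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(drawdowns(scores), reverse=True)
    # Always average at least one value so short campaigns still yield a number.
    tail_size = max(1, math.ceil((1.0 - alpha) * len(ordered)))
    return sum(ordered[:tail_size]) / tail_size


# Illustrative verified scores for six refinement cycles (made-up values).
cycle_scores = [5.0, 6.0, 5.5, 7.0, 4.0, 6.5]
campaign_mdd = max_drawdown(cycle_scores)
campaign_cvar = drawdown_cvar(cycle_scores, alpha=0.8)
```

Two campaigns that end at the same final score can differ sharply on these numbers, which is the sense in which such metrics are path-sensitive: they penalize regressions along the way rather than only the endpoint.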
Submission Number: 92