ClaimGarden: Update-Aware Claim-State Control for AI Scientist Workflows

Published: 30 May 2026, Last Modified: 30 May 2026
Venue: ICML2026-AI4Science Poster
License: CC BY 4.0
Track: Track 3: AI Scientist Proposal Competition
Keywords: AI Scientist, scientific agents, claim-state control, epistemic governance, evidence provenance, biological databases, literature updates, automated laboratory data, molecular biology, structural bioinformatics, claim verification, manuscript export gates, evidence drift
TL;DR: ClaimGarden turns project- or paper-shaped AI scientist memory into update-aware claim-state control, linking claims to versioned database, literature, computational, and automated-lab evidence and gating manuscripts as that evidence changes.
Abstract: AI scientist workflows can now generate hypotheses, run analyses, revise plans, and draft plausible papers. In data-intensive molecular biology, however, evidence changes continuously: public database releases, annotation revisions, new papers, predicted-structure resources, agent-generated analyses, and automated laboratory measurements can all alter which claims are supported, overbroad, contradicted, or obsolete. A project- or paper-shaped memory can therefore leave individual claims untested, outdated, or silently promoted from prediction to experiment. ClaimGarden shifts the unit of automation from projects and manuscripts to evolving claim states. It is an update-aware claim-state control layer for semi-autonomous AI scientists: claims are harvested from agent outputs, normalized into auditable units, linked to versioned database, literature, computational, or laboratory evidence, revalidated after evidence updates, adjudicated by policy from verifier recommendations, and used to gate manuscript export and follow-up tasks. We demonstrate the approach in structural bioinformatics, where updates to experimental-structure and predicted-structure evidence narrow a structural-coverage claim while blocking a prediction-as-experiment overclaim. ClaimGarden does not certify truth; it records who or what judged a claim, from which evidence, under which policy, and why the state changed, turning evidence drift into a measurable governance signal for self-correcting AI science.
Submission Number: 240
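
The claim lifecycle described in the abstract (link a claim to versioned evidence, revalidate it after an update, adjudicate by policy, then gate export) can be illustrated with a minimal sketch. This is not the authors' implementation. Every name here (`ClaimState`, `Evidence`, `Claim`, `revalidate`, `export_gate`, the `UNTESTED` starting state, the policy label, and the single rule used) is a hypothetical assumption chosen for illustration; only the four states supported, overbroad, contradicted, and obsolete come from the abstract.

```python
from dataclasses import dataclass, field
from enum import Enum


class ClaimState(Enum):
    # The first four states are named in the abstract; UNTESTED is an
    # assumed starting state for a freshly harvested claim.
    SUPPORTED = "supported"
    OVERBROAD = "overbroad"
    CONTRADICTED = "contradicted"
    OBSOLETE = "obsolete"
    UNTESTED = "untested"


@dataclass(frozen=True)
class Evidence:
    source: str   # hypothetical resource name, e.g. a database
    version: str  # release or revision identifier
    kind: str     # "experimental" or "predicted" (assumed vocabulary)


@dataclass
class Claim:
    text: str
    asserted_kind: str  # the kind of evidence the claim says it rests on
    evidence: Evidence
    state: ClaimState = ClaimState.UNTESTED
    history: list = field(default_factory=list)


def revalidate(claim, new_evidence, judge="rule-verifier",
               policy="hypothetical-policy-v0"):
    """Re-link a claim to updated evidence and record why its state changed."""
    # Illustrative rule only: a claim that asserts experimental support but
    # rests on predicted evidence is treated as an overclaim.
    if claim.asserted_kind == "experimental" and new_evidence.kind == "predicted":
        new_state = ClaimState.OVERBROAD
        reason = "prediction presented as experiment"
    else:
        new_state = ClaimState.SUPPORTED
        reason = "asserted evidence kind matches linked evidence"

    # Provenance record: who judged, from which evidence, under which policy.
    claim.history.append({
        "judge": judge,
        "policy": policy,
        "evidence": (new_evidence.source, new_evidence.version),
        "from_state": claim.state.value,
        "to_state": new_state.value,
        "reason": reason,
    })
    claim.evidence = new_evidence
    claim.state = new_state
    return claim


def export_gate(claims):
    """Split claims into those allowed into a manuscript and those blocked."""
    exportable = [c for c in claims if c.state is ClaimState.SUPPORTED]
    blocked = [c for c in claims if c.state is not ClaimState.SUPPORTED]
    return exportable, blocked
```

In this sketch, calling `revalidate` on a claim with a new `Evidence` record appends one history entry and updates the state, and `export_gate` then lets only supported claims through. A real policy would presumably weigh several verifier recommendations rather than apply one hard-coded rule.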