Agent Systems for Academic Research Automation

Published: 30 May 2026, Last Modified: 30 May 2026, ICML2026-AI4Science Poster, CC BY 4.0
Track: Track 1: Original Research/Position/Education/Attention Track
Keywords: AI scientists, research agents, academic research automation, verifier-matched autonomy, scientific verification, claim-level evaluation, provenance, governance
TL;DR: AI scientist systems should be evaluated by the artifacts they produce and the verifiers available for their claims, not by agent count or nominal autonomy.
Abstract: AI scientist systems increasingly propose hypotheses, plan experiments, execute analyses, draft papers, and participate in review-like workflows. As these systems move from local assistance toward producing claim-bearing scientific artifacts, aggregate descriptors such as agent count, tool use, or autonomy no longer specify what a system contributes, what artifact it produces, or what can verify the claims it helps make. We introduce verifier-matched autonomy: the principle that a system's apparent autonomy is meaningful only relative to the strength, cost, and latency of the verifier available for the claims it helps make. We develop a three-dimensional framework organized by research-lifecycle coverage, artifact type, and verification regime, and use it to position citation-grounded writing systems, survey generators, full-paper research pipelines, review agents, and domain-specific discovery systems. This framework treats labels such as tool, co-author, and founder as workflow-specific roles rather than system-wide properties. We derive implications for evaluation, provenance, and governance, emphasizing closure failure: the risk that claim-bearing scientific artifacts appear settled before their underlying claims have been adequately verified.
Submission Number: 279