Self-Adapting Agents for Automating Research Coding Workflows

Published: 05 Mar 2026, Last Modified: 14 Mar 2026 · ICLR 2026 Workshop RSI Poster · CC BY 4.0
Keywords: LLM, AI4Research, Self-Adapting Agent, Context Engineering
TL;DR: SARE is a self-adapting LLM agent that revises its own prompts, plans, and tool use to reproduce research code. Using feedback from past runs, it boosts success on SUPER-Bench and ResearchCodeBench, outperforming all prior systems.
Abstract: Existing prompt-optimization techniques use only local signals to update behavior, neglecting broader, recurring patterns across tasks; this leads to poor generalization. They also tend to rely on full-prompt rewrites or unstructured merges, causing knowledge loss. These limitations are magnified in research-coding workflows, which are characterized by heterogeneous repositories, underspecified environments, and weak feedback, and in which reproducing results from public codebases is an established evaluation regime. We introduce Self-Adapting Research Engineer (SARE), a framework that learns from a Global Training Context of cross-repository execution trajectories: it recognizes recurring failure modes, distills them into reusable heuristics, and performs targeted edits over configurable fields (the system prompt, a task-prompt template, and a cumulative cheatsheet), preserving validated instructions while incrementally adding strategies. Through this reflective prompt-optimization framework, SARE improves over prior state-of-the-art performance by 23.6% on SUPER, 3.5% on ResearchCodeBench, and 7.1% on ScienceAgentBench on their respective metrics, surpassing prior prompt-optimization techniques.
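The targeted, field-level editing described in the abstract can be sketched in miniature as below. All names here (`AgentConfig`, `apply_edit`, the field values) are illustrative assumptions for exposition, not SARE's actual implementation; the point is only that an edit touches one named field and that the cheatsheet accumulates heuristics instead of being rewritten wholesale.

```python
# Hypothetical sketch of field-targeted prompt edits; names and field
# contents are assumptions, not taken from the SARE paper.
from dataclasses import dataclass, field

@dataclass
class AgentConfig:
    # The three configurable fields named in the abstract.
    system_prompt: str = "You are a research engineer."
    task_template: str = "Reproduce the results of {repo}."
    cheatsheet: list = field(default_factory=list)  # validated heuristics

def apply_edit(cfg: AgentConfig, target: str, content: str) -> AgentConfig:
    """Apply a targeted edit: only the named field changes.

    The cheatsheet grows incrementally, so previously validated
    instructions are preserved rather than overwritten.
    """
    if target == "cheatsheet":
        if content not in cfg.cheatsheet:  # avoid duplicate entries
            cfg.cheatsheet.append(content)
    elif target == "system_prompt":
        cfg.system_prompt = content
    elif target == "task_template":
        cfg.task_template = content
    else:
        raise ValueError(f"unknown field: {target}")
    return cfg

# A recurring failure mode distilled into a reusable heuristic
# (the heuristic text is invented for illustration):
cfg = AgentConfig()
apply_edit(cfg, "cheatsheet", "Pin dependency versions before installing.")
apply_edit(cfg, "cheatsheet", "Pin dependency versions before installing.")
```

Under this sketch, the second identical edit is a no-op, so the cheatsheet stays duplicate-free while the system prompt is left untouched.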
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 127