Keywords: LLM, agents, self-improvement
Abstract: Coding agents can recursively modify their own implementations, forming a self-improvement loop. While prior work shows this can boost performance on coding benchmarks, existing approaches are costly and compute-intensive. We introduce a simple, sample-efficient self-improvement framework that significantly improves coding performance under strict budget constraints. We identify the evaluation of candidate self-modifications as the main bottleneck: prior approaches estimate a modification's effectiveness by re-running a subset of benchmark tasks with the modified agent. We replace these partial benchmark runs with an LLM-as-a-judge signal to rank candidate patches, reserving expensive evaluations for only the most promising ones. Combined with a lightweight tree search, this enables effective exploration with minimal overhead. On a subset of SWE-bench Verified, our method improves the pass rate of gpt-5-mini by over 11\% in fewer than three self-improvement steps, using just \$25 in API costs within 15 CPU hours.
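A minimal sketch of the judge-based ranking step the abstract describes: score each candidate patch with an LLM judge and pass only the top few on to expensive benchmark evaluation. The prompt wording, scoring scale, judge model choice, and function names are illustrative assumptions, not the paper's actual implementation.

```python
# Hypothetical sketch: rank candidate self-modification patches with an
# LLM judge, then fully evaluate only the top-k. Prompt, 1-10 scale, and
# model choice are assumptions for illustration.
from openai import OpenAI

client = OpenAI()

JUDGE_PROMPT = (
    "You are reviewing a patch that modifies a coding agent's own source.\n"
    "Rate it from 1 (likely harmful) to 10 (very likely to improve the\n"
    "agent's benchmark pass rate). Reply with the number only.\n\n"
    "Patch:\n{patch}"
)

def judge_score(patch: str, model: str = "gpt-5-mini") -> float:
    """Ask the judge model for a scalar quality score for one candidate patch."""
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": JUDGE_PROMPT.format(patch=patch)}],
    )
    try:
        return float(resp.choices[0].message.content.strip())
    except ValueError:
        return 0.0  # treat an unparseable reply as a rejection

def select_candidates(patches: list[str], k: int = 3) -> list[str]:
    """Keep only the k highest-scoring patches for expensive benchmark runs."""
    return sorted(patches, key=judge_score, reverse=True)[:k]
```

The cheap judge call replaces a partial benchmark run per candidate, so the costly evaluation budget is spent only on the patches most likely to help.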
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 126