Keywords: large language models, AI for research
TL;DR: We propose AlphaResearch, an autonomous research agent designed to discover out-of-boundary algorithms.
Abstract: Large language models have made significant progress on complex but easy-to-verify problems, yet they still struggle with discovering the unknown. In this paper, we present AlphaResearch, an autonomous research agent designed to discover new algorithms on open-ended problems by iterating over three steps: (1) propose new ideas, (2) write programs to verify them, and (3) optimize the research proposals. To balance the feasibility and the innovation of the discovery process, we construct a new reward environment that combines an execution-based verifiable reward with a reward from a simulated real-world peer-review environment. We also construct AlphaResearchComp, a new evaluation benchmark consisting of a competition on eight open-ended algorithmic problems, each carefully curated and verified through executable pipelines, objective metrics, and reproducibility checks. AlphaResearch achieves a 2/8 win rate in head-to-head comparison with human researchers. Notably, the algorithm discovered by AlphaResearch on the “packing circles” problem achieves the best known performance, surpassing the results of human researchers and strong baselines from recent work (e.g., AlphaEvolve). Additionally, we conduct a comprehensive analysis of the benefits and remaining challenges of autonomous research agents, providing valuable insights for future research.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 22639
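
The propose, verify, and optimize loop described in the abstract can be pictured with the minimal sketch below. It is an illustration under stated assumptions, not the paper's implementation: the names `research_loop`, `propose`, `review`, `execute`, and `review_threshold` are hypothetical, and the way the two rewards are combined here (the simulated peer-review score gates an idea, the execution score ranks it) is one plausible reading that the abstract does not confirm.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Candidate:
    idea: str       # natural-language research proposal
    program: str    # code that implements the idea
    score: float    # execution-based, verifiable score (higher is better)


def research_loop(
    propose: Callable[[Optional[Candidate]], Tuple[str, str]],
    review: Callable[[str], float],
    execute: Callable[[str], Optional[float]],
    rounds: int,
    review_threshold: float,
) -> Optional[Candidate]:
    """Hypothetical propose -> verify -> optimize loop.

    Assumption: an idea must pass a simulated peer-review score before its
    program is executed, and the best executed candidate seeds the next round.
    """
    best: Optional[Candidate] = None
    for _ in range(rounds):
        # (1) Propose a new idea and program, conditioned on the current best.
        idea, program = propose(best)

        # Simulated peer review: skip ideas scored below the threshold.
        if review(idea) < review_threshold:
            continue

        # (2) Verify by execution; None signals a failed or invalid run.
        score = execute(program)
        if score is None:
            continue

        # (3) Optimize: keep the candidate only if it improves on the best.
        if best is None or score > best.score:
            best = Candidate(idea, program, score)
    return best
```

In practice, `propose` and `review` would be backed by language models and `execute` by a problem-specific evaluator; here they are left as plain callables so the control flow can be read in isolation.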