Keywords: LLMs, code generation, program search, random search, baseline
Abstract: Difficult coding problems are often solved by prompting large language models to generate programs and iterating on the code until a solution is found. Many works have proposed ways to guide this iterative process but often do not compare against simpler baselines. Taking nine problems from the AlphaEvolve paper as case studies, we find that randomly sampling programs with an LLM works well, matching AlphaEvolve on two of them and matching or improving over a strong open-source baseline, ShinkaEvolve, on eight. This suggests that some reported improvements may stem not from the LLM-driven program search itself but from the manual problem formulation that makes these problems easy to optimize.
Submission Number: 42
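For concreteness, the following is a minimal sketch of the random-sampling baseline described in the abstract: an LLM is queried repeatedly with the same fixed prompt, each returned program is scored by a problem-specific evaluator, and the best-scoring program is kept. The sample_program and score callables and the budget parameter are illustrative assumptions, not the paper's actual interface or code.

from typing import Callable, Optional, Tuple


def random_search(
    sample_program: Callable[[str], str],
    score: Callable[[str], float],
    prompt: str,
    budget: int = 100,
) -> Tuple[Optional[str], float]:
    """Draw `budget` independent program samples and keep the best one.

    Unlike evolutionary or guided search, no feedback from earlier
    candidates is fed back into later prompts: every draw uses the
    same fixed prompt.
    """
    best_program: Optional[str] = None
    best_score = float("-inf")
    for _ in range(budget):
        candidate = sample_program(prompt)   # one independent LLM call
        candidate_score = score(candidate)   # problem-specific evaluator
        if candidate_score > best_score:
            best_program, best_score = candidate, candidate_score
    return best_program, best_score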