EvoPlan: Agent-driven Evolutionary Planning for LLM Reasoning

ICLR 2026 Conference Submission12732 Authors

18 Sept 2025 (modified: 08 Oct 2025) | ICLR 2026 Conference Submission | Readers: Everyone | License: CC BY 4.0

Keywords: Agents, Planning, Evolutionary Search, Reasoning

TL;DR: EvoPlan replaces costly execution-based rollouts with lightweight LLM critics that score complete plans, enabling evolutionary (MCTS-style) plan search that achieves higher accuracy than vanilla MCTS while using 90% less GPU time.

Abstract: Efficiently generating high-quality plans is a critical yet unsolved challenge for Large Language Model (LLM) agents tackling complex reasoning tasks. Prevailing search-based planners, such as those employing Monte Carlo Tree Search (MCTS) or exploring a tree-of-thoughts, are fundamentally bottlenecked by their reliance on costly, execution-based rollouts to evaluate partial solutions, leading to prohibitive computational overhead. We introduce a novel agentic planning paradigm that circumvents this limitation by replacing expensive execution with efficient, static evaluation. Our framework employs a duo of specialized LLM critics: a Logical Consistency Agent to scrutinize a plan's internal coherence and a Feasibility Agent to assess its practical executability. These critics provide rich, multi-faceted feedback that guides a novel evolutionary search algorithm, which iteratively refines complete candidate plans toward global optimality. On diverse mathematical reasoning benchmarks (e.g., GSM8K, AIME), our approach surpasses vanilla MCTS by +8.72pp while using 90% less GPU time, and outperforms LLM-based search by +7.66pp with 30% fewer search steps. Our work demonstrates that decoupling plan evaluation from execution through specialized agentic critics enables a more scalable and effective paradigm for LLM-based planning and reasoning.
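
Illustrative sketch (not from the submission): the loop described in the abstract, where critics score complete plans and an evolutionary search refines them, might look roughly like the Python below. Everything here is an assumption made for illustration. The names `evolve_plans`, `consistency_critic`, `feasibility_critic`, and `mutate` are hypothetical, the two critics are toy heuristics standing in for the LLM-based Logical Consistency and Feasibility Agents, and the elitist selection and single-step mutation are generic evolutionary operators, since the abstract does not specify which operators EvoPlan uses.

```python
import random
from typing import Callable, List

Plan = List[str]                      # a complete candidate plan: ordered step labels
Critic = Callable[[Plan], float]      # static evaluator returning a score in [0, 1]

# Hypothetical step vocabulary; "guess" plays the role of an infeasible step.
ALLOWED = {"parse", "define", "solve", "check"}


def consistency_critic(plan: Plan) -> float:
    """Toy stand-in for a logical-consistency critic: penalize repeated steps."""
    return len(set(plan)) / len(plan) if plan else 0.0


def feasibility_critic(plan: Plan) -> float:
    """Toy stand-in for a feasibility critic: fraction of steps that are allowed."""
    return sum(step in ALLOWED for step in plan) / len(plan) if plan else 0.0


def mutate(plan: Plan, rng: random.Random) -> Plan:
    """Replace one randomly chosen step with a random step from the vocabulary."""
    child = list(plan)
    index = rng.randrange(len(child))
    child[index] = rng.choice(sorted(ALLOWED | {"guess"}))
    return child


def evolve_plans(
    population: List[Plan],
    critics: List[Critic],
    mutate_fn: Callable[[Plan, random.Random], Plan],
    generations: int = 10,
    survivors: int = 2,
    seed: int = 0,
) -> Plan:
    """Elitist evolutionary search over complete plans, scored without execution."""
    rng = random.Random(seed)

    def fitness(plan: Plan) -> float:
        # Average the critics' scores; no plan is ever executed.
        return sum(critic(plan) for critic in critics) / len(critics)

    pop = [list(plan) for plan in population]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[:survivors]                      # keep the best plans unchanged
        children = [
            mutate_fn(rng.choice(elite), rng)        # refill by mutating survivors
            for _ in range(len(pop) - survivors)
        ]
        pop = elite + children
    return max(pop, key=fitness)


initial_plans = [
    ["guess", "guess", "solve"],
    ["parse", "guess", "guess"],
    ["guess", "solve", "solve"],
    ["parse", "parse", "check"],
]
best_plan = evolve_plans(
    initial_plans, [consistency_critic, feasibility_critic], mutate, generations=20
)
print(best_plan)
```

Because the best plans are carried over unchanged each generation, the top score in this sketch never decreases. In a real system each critic call would be an LLM query, so the cost per generation would scale with population size times the number of critics.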

Supplementary Material: zip

Primary Area: causal reasoning

Submission Number: 12732
