Optimizable LLM Planning: A Branch-and-Bound Framework for Complex Tasks

Peter Baile Chen; Yi Zhang; Weiyue Li; Mike Cafarella; Samuel Madden; Jacob Andreas; Dan Roth

19 Sept 2025 (modified: 04 Dec 2025) | ICLR 2026 Conference Withdrawn Submission | CC BY 4.0

Keywords: planning, branch and bound, reasoning

Abstract: Optimizing reasoning and action planning with LLMs under limited computational budgets is a fundamental problem in solving complex tasks that require multi-step decomposition. Existing approaches, including greedy step-by-step reasoning and tree-based search, remain largely budget-blind. In these methods, the budget is enforced through ad hoc stopping rules rather than treated as an explicit optimization objective, which prevents them from adapting the depth and breadth of reasoning to the available budget. To address this, we introduce **O**ptimizable **L**LM **P**lanning (OLP), a branch-and-bound framework that formulates planning as a budgeted optimization problem for task success. At each step, each candidate plan is expanded by decomposing the task into an immediately solvable subtask and a residual subtask. For the residual, the planner estimates lower and upper bounds on utility, calibrated from reward and cost signals, where the reward model is adaptable to different execution operators (e.g., retrieval, LLM reasoning). This calibration enforces budget feasibility and supports principled ranking of candidate plans. The bound-guided search avoids unrolling entire trajectories, focuses exploration on candidates whose upper bounds dominate, and prunes branches whose upper bounds fall below competing lower bounds, enabling effective exploration of both depth and breadth under budget constraints. We instantiate this general framework for retrieval-augmented generation (RAG) problems that require reasoning. Across multiple benchmarks, our framework achieves higher accuracy than strong agentic baselines that use different search algorithms while substantially reducing computation, demonstrating the effectiveness of making planning explicitly optimizable under budget constraints.
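The bound-guided search described in the abstract can be illustrated with a minimal, generic branch-and-bound sketch. This is not the authors' implementation: the operator table, its costs and rewards, the additive utility, the fixed plan depth, and the names `OPERATORS`, `bounds`, and `plan_search` are all hypothetical, chosen only to show how upper bounds order exploration, how lower bounds prune, and how infeasible (over-budget) expansions are skipped.

```python
import heapq
import itertools

# Hypothetical operators: name -> (cost, reward). All values are made up.
OPERATORS = {"retrieve": (2, 5), "reason": (1, 3), "skip": (0, 0)}


def bounds(gain, cost, steps_left, budget):
    """Return (lower, upper) bounds on the final utility of a partial plan."""
    remaining = budget - cost
    affordable = [g for c, g in OPERATORS.values() if c <= remaining]
    lower = gain  # always achievable: fill the remaining steps with "skip"
    upper = gain + steps_left * max(affordable)  # optimistic: best affordable op each step
    return lower, upper


def plan_search(depth, budget):
    """Best-first branch and bound over plans of exactly `depth` steps."""
    tie = itertools.count()  # unique tie-breaker so heap never compares plans
    lo, hi = bounds(0, 0, depth, budget)
    frontier = [(-hi, next(tie), 0, 0, ())]  # max-heap on the upper bound
    best_lower = lo  # best guaranteed utility seen so far
    while frontier:
        neg_hi, _, gain, cost, plan = heapq.heappop(frontier)
        if -neg_hi < best_lower:
            continue  # prune: upper bound falls below a competing lower bound
        if len(plan) == depth:
            # A complete plan popped first has utility >= every other upper bound.
            return plan, gain, cost
        for name, (c, g) in OPERATORS.items():
            if cost + c > budget:
                continue  # budget feasibility: never exceed the budget
            child = plan + (name,)
            lo, hi = bounds(gain + g, cost + c, depth - len(child), budget)
            best_lower = max(best_lower, lo)
            if hi >= best_lower:
                heapq.heappush(frontier, (-hi, next(tie), gain + g, cost + c, child))
    return None


plan, utility, spent = plan_search(depth=3, budget=4)
print(plan, utility, spent)
```

In this toy setting the bounds are exact arithmetic; in the setting the abstract describes, they would instead be estimates calibrated from reward and cost signals, so the pruning would be only as reliable as that calibration.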

Primary Area: foundation or frontier models, including LLMs

Submission Number: 15061
