Wider or Deeper? Scaling LLM Inference-Time Compute with Adaptive Branching Tree Search

Published: 06 Mar 2025, Last Modified: 25 Mar 2025, ICLR 2025 FM-Wild Workshop, CC BY 4.0
Keywords: large language models, inference-time scaling, monte carlo tree search, exploration-exploitation, code generation
TL;DR: We propose AB-MCTS, which dynamically balances exploring new solutions (“go wider”) and refining promising ones (“go deeper”) to boost inference-time performance of LLMs on complex tasks.
Abstract:

Recent advances demonstrate that increasing inference-time computation can significantly boost the reasoning capabilities of large language models (LLMs). Although repeated sampling (i.e., generating multiple candidate outputs) is a highly effective strategy, it falls short when external feedback is available to guide response selection and refinement. In this work, we propose $\textit{Adaptive Branching Monte Carlo Tree Search (AB-MCTS)}$, a novel inference-time framework that unifies repeated sampling with principled multi-turn exploration and exploitation. At each node in the search tree, AB-MCTS dynamically decides whether to "go wider" by expanding new candidate responses or "go deeper" by revisiting existing ones based on external feedback signals. We evaluate our method on complex coding and engineering tasks using frontier API models. Empirical results show that AB-MCTS consistently outperforms both repeated sampling and standard MCTS, underscoring the importance of combining the response diversity of LLMs with multi-turn solution refinement for effective inference-time scaling.
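The per-node "go wider" versus "go deeper" decision described above can be illustrated with a small sketch. The abstract does not specify the selection rule, so everything here is an assumption made for illustration: the sketch uses Thompson sampling over Beta posteriors, feedback scores in [0, 1], and hypothetical names (`Node`, `choose_action`, `search`, and the user-supplied `generate` and `evaluate` callables). It should not be read as the authors' implementation.

```python
import random
from dataclasses import dataclass, field
from typing import Any, Callable, Iterator, Optional


@dataclass
class Node:
    answer: Any                      # candidate response (None for the root)
    score: float                     # external feedback, assumed to lie in [0, 1]
    children: list = field(default_factory=list)


def iter_nodes(node: Node) -> Iterator[Node]:
    """Yield every descendant of `node` (excluding `node` itself)."""
    for child in node.children:
        yield child
        yield from iter_nodes(child)


def _beta_draw(scores: list, rng: random.Random) -> float:
    # Beta(1, 1) prior updated with fractional successes/failures (an assumption).
    a = 1.0 + sum(scores)
    b = 1.0 + sum(1.0 - s for s in scores)
    return rng.betavariate(a, b)


def choose_action(node: Node, rng: random.Random) -> int:
    """Return -1 to go wider (new child), or the index of a child to go deeper."""
    # "Wider" arm: posterior pooled over the scores of all existing children.
    best_idx = -1
    best_draw = _beta_draw([c.score for c in node.children], rng)
    # "Deeper" arms: one posterior per child, built from that child's subtree.
    for i, child in enumerate(node.children):
        subtree = [child.score] + [n.score for n in iter_nodes(child)]
        draw = _beta_draw(subtree, rng)
        if draw > best_draw:
            best_idx, best_draw = i, draw
    return best_idx


def search(
    generate: Callable[[Optional[Any]], Any],   # None -> fresh sample, else refine
    evaluate: Callable[[Any], float],           # external feedback in [0, 1]
    budget: int,
    seed: int = 0,
):
    rng = random.Random(seed)
    root = Node(answer=None, score=0.0)
    for _ in range(budget):
        node = root
        # Descend while the sampled decision is "deeper"; stop on "wider".
        while True:
            idx = choose_action(node, rng)
            if idx == -1:
                break
            node = node.children[idx]
        answer = generate(node.answer)
        node.children.append(Node(answer, evaluate(answer)))
    best = max(iter_nodes(root), key=lambda n: n.score)
    return best, root
```

In this sketch each budget unit adds exactly one node, so always choosing "wider" at the root reduces to repeated sampling, while always choosing "deeper" reduces to a single sequential refinement chain. The sampled decision lets the tree shape fall between those two extremes depending on the feedback observed.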

Submission Number: 61