Confidence-Guided MCTS for Efficient Long-Horizon Web Agent Tasks

ICLR 2026 Conference Submission17552 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0
Keywords: AI Agent, Tree Search, Internal Feedback
Abstract: LLM agents that solve long-horizon tasks on the web often rely on Monte Carlo Tree Search (MCTS) to plan and reason over extended trajectories. While effective, standard MCTS requires wide branching and repeated value evaluations, which makes it compute-intensive. We introduce confidence-guided MCTS, a method that uses internal certainty signals derived from the model’s own log-probabilities to allocate search compute more efficiently. Confidence-guided MCTS branches adaptively: it adjusts the width of the tree according to how confident the model is, narrowing expansion when predictions are already decisive and widening it when they are uncertain. We also present several variants for integrating confidence into tree search; for example, weighted backpropagation incorporates certainty directly into value updates, amplifying reliable rollouts and reducing the impact of noisy ones. The method shows that lightweight internal signals can guide search more effectively, reducing inference computation while preserving or even improving success on complex long-horizon tasks and moving closer to the Pareto frontier between compute cost and task success. Confidence-guided MCTS highlights a simple but powerful direction: using the model’s own certainty to make search-augmented agents more efficient without extra supervision.
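The two mechanisms named in the abstract, adaptive branching and confidence-weighted backpropagation, can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything here is an assumption for illustration only: the confidence proxy (geometric-mean token probability), the linear mapping from confidence to width, the `min_width`/`max_width` parameters, and the `Node` class are hypothetical and are not the authors' implementation.

```python
import math


def mean_token_confidence(token_logprobs):
    """Assumed confidence proxy: geometric-mean token probability, in (0, 1]."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def adaptive_branching(confidence, min_width=1, max_width=5):
    """Hypothetical linear rule: high confidence -> narrow, low -> wide."""
    confidence = min(max(confidence, 0.0), 1.0)  # clamp to [0, 1]
    width = min_width + (1.0 - confidence) * (max_width - min_width)
    return int(round(width))


class Node:
    """Minimal search-tree node holding confidence-weighted statistics."""

    def __init__(self, parent=None):
        self.parent = parent
        self.weight_sum = 0.0  # sum of rollout confidences seen so far
        self.value_sum = 0.0   # sum of confidence * reward

    @property
    def value(self):
        # Confidence-weighted mean reward; 0.0 before any update.
        return self.value_sum / self.weight_sum if self.weight_sum > 0 else 0.0


def weighted_backpropagate(leaf, reward, confidence):
    """Propagate a rollout reward to the root, weighted by its confidence."""
    node = leaf
    while node is not None:
        node.weight_sum += confidence
        node.value_sum += confidence * reward
        node = node.parent


if __name__ == "__main__":
    root = Node()
    child = Node(parent=root)
    conf = mean_token_confidence([-0.05, -0.10, -0.02])
    print("branching width:", adaptive_branching(conf))
    weighted_backpropagate(child, reward=1.0, confidence=conf)
    print("root value:", root.value)
```

Under this assumed scheme, a confident rollout moves a node's value estimate more than an uncertain one, and a confident expansion step spawns fewer children. Other mappings (for example entropy-based confidence or a thresholded width) would fit the abstract's description equally well.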
Primary Area: foundation or frontier models, including LLMs
Submission Number: 17552