AGENT*: Optimizing Test-Time Compute for Multi-Agent Systems with Modularized Collaboration

ICLR 2026 Conference Submission21185 Authors

19 Sept 2025 (modified: 08 Oct 2025) · License: CC BY 4.0
Keywords: Multi-Agent System, Test-time Scaling
Abstract: Scaling test-time computation has emerged as a powerful and increasingly popular approach for improving the performance of large language models without additional training. Recent work demonstrates that techniques such as repeated sampling, self-verification, and self-reflection can significantly enhance task success by allocating more inference-time compute. However, applying these techniques directly to multi-agent systems is challenging, as they provide no principled way to encourage collaboration or to manage compute allocation across multiple agents under budget constraints. To address this, we propose AGENT*, a general framework for enabling effective multi-agent collaboration while operating within strict compute budgets. AGENT* introduces the notion of \emph{modularized collaboration}, formalized as callable functions that encapsulate reusable multi-agent workflows; these modules are constructed automatically via self-play reflection, which abstracts recurring interaction patterns from past trajectories. Building on these collaboration modules, AGENT* employs \emph{a dual-level planning architecture} that optimizes compute allocation by reasoning over the current task state while also \emph{speculating} on future steps. Experiments on complex agent benchmarks demonstrate that AGENT* consistently outperforms baselines across diverse budget settings, validating the effectiveness of multi-agent collaboration for inference-time optimization.
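The abstract's two key ideas — collaboration modules as callable functions with known compute costs, and a planner that allocates a strict budget across them — can be illustrated with a minimal sketch. Everything below is hypothetical: the class and function names, the greedy score-per-cost heuristic, and the illustrative utility scores are assumptions for exposition, not the paper's actual algorithm.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical sketch: a collaboration module packages a reusable
# multi-agent workflow as a callable with a known compute cost.
@dataclass
class CollaborationModule:
    name: str
    cost: int                  # inference-time compute units the module consumes
    run: Callable[[str], str]  # executes the multi-agent workflow on a task state

def plan_under_budget(modules: List[CollaborationModule],
                      scores: Dict[str, float],
                      budget: int) -> List[str]:
    """Toy budget-aware planner: greedily pick modules with the best
    speculated utility per unit of compute until the budget runs out."""
    chosen: List[str] = []
    remaining = budget
    ranked = sorted(modules, key=lambda m: scores[m.name] / m.cost, reverse=True)
    for m in ranked:
        if m.cost <= remaining:
            chosen.append(m.name)
            remaining -= m.cost
    return chosen

modules = [
    CollaborationModule("debate", cost=4, run=lambda s: s + " [debated]"),
    CollaborationModule("verify", cost=2, run=lambda s: s + " [verified]"),
    CollaborationModule("reflect", cost=3, run=lambda s: s + " [reflected]"),
]
# Speculated utility of each module for the current task state (illustrative).
scores = {"debate": 0.9, "verify": 0.6, "reflect": 0.5}

plan = plan_under_budget(modules, scores, budget=6)
print(plan)  # → ['verify', 'debate']
```

With a budget of 6 units, the planner selects "verify" (ratio 0.30) and "debate" (ratio 0.225) but skips "reflect", since its cost of 3 exceeds the remaining budget of 0. The real dual-level architecture additionally speculates on future steps rather than scoring modules myopically; this sketch only conveys the budget-constrained selection over modules.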
Primary Area: foundation or frontier models, including LLMs
Submission Number: 21185