Rethinking the Value of Multi-Agent Workflow: A Strong Single Agent Baseline

ICLR 2026 Conference Submission14377 Authors

18 Sept 2025 (modified: 08 Oct 2025)ICLR 2026 Conference SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Large Language Models, Multi-Agent Systems, KV Cache, Automatic Workflow Design
Abstract: Recent advances in LLM-based multi-agent systems (MAS) show that workflows composed of multiple LLM agents with distinct roles, tools, and communication patterns can outperform single-LLM baselines on complex tasks. However, most frameworks are homogeneous, where all agents share the same base LLM and differ only in prompts, tools, and positions in the workflow. This raises the question of whether such workflows can be simulated by a single agent through multi-turn conversations. We investigate this across six benchmarks spanning coding, mathematics, and general question answering. Our results show that a single agent can reach the performance of homogeneous workflows with an efficiency advantage from KV cache reuse, and can even outperform an automatically optimized heterogeneous workflow. Building on this finding, we propose $\textbf{OneFlow}$, an algorithm that automatically tailors workflows for single-agent execution, reducing inference costs compared to existing automatic multi-agent design frameworks without trading off accuracy. These results position the single-LLM implementation of multi-agent workflows as a strong baseline for MAS research. We also note that single-LLM methods cannot capture heterogeneous workflows due to the lack of KV cache sharing across different LLMs, highlighting future opportunities in developing $\textit{truly}$ heterogeneous multi-agent systems.
Primary Area: foundation or frontier models, including LLMs
Submission Number: 14377
Loading