Keywords: agent foundation models, agent frameworks, LLMs
TL;DR: Chain-of-Agents (CoA) distills multi-agent systems into a single end-to-end LLM paradigm, enabling efficient agent-like problem solving and achieving state-of-the-art results with open-sourced Agent Foundation Models (AFMs).
Abstract: Recent advances in large language models (LLMs) and multi-agent systems have demonstrated remarkable capabilities in complex problem-solving tasks such as deep research, vibe coding, and mathematical reasoning. However, most existing multi-agent systems are built upon manual prompt/workflow engineering with sophisticated agent frameworks, making them computationally inefficient, less capable, and unable to benefit from data-centric learning. In this work, we introduce Chain-of-Agents (CoA), a novel paradigm of LLM reasoning that enables native end-to-end complex problem solving in the same manner as a multi-agent system (i.e., multi-turn problem solving with multiple tools and multiple agents) within one model. In chain-of-agents problem solving, the model dynamically activates different tool agents and role-playing agents to simulate multi-agent collaboration in an end-to-end fashion. To elicit end-to-end chain-of-agents problem-solving abilities in LLMs, we introduce a multi-agent distillation framework that distills state-of-the-art multi-agent systems into chain-of-agents trajectories for agentic supervised fine-tuning. We then apply agentic reinforcement learning on verifiable agentic tasks to further improve the models' chain-of-agents problem-solving capabilities. We call the resulting models Agent Foundation Models (AFMs). Our empirical studies demonstrate that AFMs establish new state-of-the-art performance across diverse benchmarks in search, math, and code settings.
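To make the paradigm concrete, here is a minimal, hypothetical sketch of a chain-of-agents loop: a single model emits tagged spans that activate a tool agent, the tool's observation is appended to the context, and the model continues end-to-end. All names and tags below (`fake_model`, `search_tool`, `<search>`, `<observation>`, `<answer>`) are illustrative assumptions, not the paper's actual API.

```python
import re

def fake_model(context: str) -> str:
    """Stand-in for a single CoA-trained LLM (returns canned output for illustration)."""
    if "<observation>" not in context:
        return "<think>Need a fact.</think><search>capital of France</search>"
    return "<answer>Paris</answer>"

def search_tool(query: str) -> str:
    """Stand-in tool agent; a real system would call a search backend."""
    return "Paris is the capital of France."

def chain_of_agents(question: str, max_turns: int = 4) -> str:
    """Single-model multi-turn loop: the model, not a framework, decides
    which agent to activate next by emitting the corresponding tag."""
    context = question
    for _ in range(max_turns):
        out = fake_model(context)
        m = re.search(r"<search>(.*?)</search>", out)
        if m:  # model activated the search tool agent
            obs = search_tool(m.group(1))
            context += out + f"<observation>{obs}</observation>"
            continue
        m = re.search(r"<answer>(.*?)</answer>", out)
        if m:  # model produced a final answer
            return m.group(1)
    return ""

print(chain_of_agents("What is the capital of France?"))  # → Paris
```

In this sketch the entire rollout is one token stream from one model, which is what makes the trajectory amenable to supervised fine-tuning and reinforcement learning, in contrast to a framework that orchestrates separate prompted agents.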
Primary Area: foundation or frontier models, including LLMs
Submission Number: 5461