Scaling Behavior of Single LLM-Driven Multi-Agent Systems

ACL ARR 2026 January Submission3465 Authors

04 Jan 2026 (modified: 20 Mar 2026) · ACL ARR 2026 January Submission · CC BY 4.0
Keywords: large language models, LLM agents, multi-agent systems, scaling laws, collaborative intelligence, agent collaboration, emergent behavior, coordination overhead, diminishing returns, sequential communication, collective reasoning
Abstract: The burgeoning field of LLM-based Multi-Agent Systems (MAS) promises to tackle complex tasks through collaborative intelligence, yet fundamental questions regarding their scaling behavior and intrinsic collective dynamics remain underexplored. This paper systematically investigates how the performance of a homogeneous MAS evolves as the number of agents increases, isolating the variable of collaboration from model or knowledge heterogeneity. We propose the Sequential Iterative Multi-Agent System (SIMAS) framework, a minimalist architecture centered on sequential inter-agent communication, to clearly observe scaling effects. Through extensive experiments across diverse tasks and model scales, we establish that MAS performance does not scale monotonically with agent count but follows a pattern of diminishing returns, governed by a trade-off between collaborative synergy and coordination overhead. Our findings reveal that effective MAS requires a sufficiently capable base LLM, that task type critically modulates the optimal agent count, and that collective intelligence is an emergent property contingent on strategic interaction design rather than a guaranteed outcome of agent plurality. This work provides a foundational understanding of MAS scaling laws, offering practical guidance for designing efficient collaborative systems and challenging the prevailing assumption that more agents invariably lead to better performance.
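The abstract describes SIMAS only at a high level. A minimal sketch of what a sequential iterative pipeline of homogeneous agents could look like, assuming each agent receives the task together with the previous agent's answer (the `agent_fn` callable and `toy_agent` stand-in are hypothetical illustrations, not the authors' implementation):

```python
from typing import Callable

def simas_pipeline(task: str, num_agents: int,
                   agent_fn: Callable[[str, str], str]) -> str:
    """Run a sequential chain of identical agents.

    Each agent sees the original task and the previous agent's
    answer, and emits a (possibly refined) answer. The last
    agent's output is the system's final answer.
    """
    answer = ""  # the first agent starts with no prior answer
    for _ in range(num_agents):
        answer = agent_fn(task, answer)
    return answer

# Toy stand-in for an LLM call, used only to trace the chain.
def toy_agent(task: str, prior: str) -> str:
    return (prior + " -> " if prior else "") + f"answer({task})"

print(simas_pipeline("2+2", 3, toy_agent))
# -> answer(2+2) -> answer(2+2) -> answer(2+2)
```

In a real system, `agent_fn` would wrap an LLM call whose prompt embeds the prior answer; the chain's length (`num_agents`) is the scaling variable the paper studies, with the trade-off between synergy and coordination overhead determining the optimal depth.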
Paper Type: Long
Research Area: AI/LLM Agents
Research Area Keywords: Language Modeling, Efficient/Low-Resource Methods for NLP, Generation
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 3465