Rethinking Scale: Deployment Trade-offs of Small Language Models under Agent Paradigms

Published: 18 Apr 2026 · Last Modified: 24 Apr 2026
ACL 2026 Industry Track Poster
License: CC BY 4.0
Keywords: Small Language Models (SLMs), Agent Paradigms, Tool-Augmented Reasoning, Multi-Agent Systems
Abstract: Despite the impressive capabilities of large language models, their substantial computational costs, latency, and privacy risks hinder their widespread deployment in real-world applications. Small Language Models (SLMs) with fewer than 10 billion parameters present a promising alternative; however, their inherent limitations in knowledge and reasoning curtail their effectiveness. Existing research primarily focuses on enhancing SLMs through scaling laws or fine-tuning strategies while overlooking the potential of using agent paradigms, such as tool use and multi-agent collaboration, to systematically compensate for the inherent weaknesses of small models. To address this gap, this paper presents the first large-scale, comprehensive study of <10B open-source models under three paradigms: (1) the base model, (2) a single agent equipped with tools, and (3) a routing-based multi-agent system with collaborative capabilities. Our results show that structured agent frameworks (combining step-by-step reasoning and tool use) substantially improve effectiveness over direct prompting, with single-agent systems achieving the best balance between performance and cost. In contrast, routing-based multi-agent setups introduce additional coordination overhead with limited gains under small-model constraints. Our findings highlight the importance of agent-centric design for efficient and trustworthy deployment in resource-constrained settings.
Submission Type: Emerging
Submission Number: 429