Beyond Generalist LLMs: Specialist Agentic Systems For Structured Code Workflow Execution

ICLR 2026 Conference Submission15998 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, CC BY 4.0

Keywords: LLM Agents, Automatic Creation of Agentic Systems, Efficient Workflow Generation

TL;DR: An exploration of the cost, efficiency, and performance benefits of bespoke, task-specific, structured LLM agentic workflows for code generation, compared with generalist LLM coding agents.

Abstract: The rise of Large Language Models (LLMs) has accelerated the adoption of software development agents, now commonly found as IDE extensions and standalone applications. These agents enable users with minimal programming experience to build complete applications in minutes. Typically designed as generalists, they leverage the broad capabilities of LLMs to perform a wide range of tasks. This versatility raises a key question: do specialist agents offer meaningful advantages over generalist ones, particularly given the additional development effort they require? To explore this question empirically, we focus on business process automation, specifically the transformation of tasks defined in Business Process Model and Notation (BPMN) diagrams into executable agentic workflows. We introduce a specialist workflow tailored for this purpose and evaluate its performance against generalist solutions. Our findings show that, in this context, the specialist agentic solution produces agents that outperform those generated by generalist agents such as Roo and Cline by 2.75% in accurate task completion, while reducing the token cost of agent generation by 96%. Additionally, we identify several limitations in generalist agents, including code generation that is inconsistent in both functionality and quality. These inconsistencies hinder their applicability in industrial settings, where reliability and maintainability are critical for large-scale adoption.

Primary Area: applications to robotics, autonomy, planning

Submission Number: 15998
