MACS-Coder: A Multi-Agent Coding Framework for Small LMs — From Fast Thinking to Deep Planning

12 Sept 2025 (modified: 03 Dec 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Multi-Agent Systems, Code Generation, Large Language Models (LLMs), Test-time Compute
TL;DR: Efficient multi-agent framework for coding.
Abstract: Large Language Models (LLMs) have made significant strides in code generation, yet solving complex programming tasks remains a major challenge. Current state-of-the-art (SOTA) multi-agent frameworks, while powerful, often rely on a resource-intensive, one-size-fits-all strategy. We introduce MACS-Coder (Multi-Agent Adaptive Coding Structure), a novel dual-process framework designed for high efficiency. Inspired by human cognition, it comprises two systems: a Fast Thinking System for rapid, low-cost code generation, and a Deep Planning System for methodical, deliberative problem-solving. This dual architecture allows small models to achieve performance comparable to much larger proprietary models while consuming far less energy and producing lower CO$_2$ emissions. MACS-Coder dynamically adapts its strategy, employing its Fast Thinking System for simpler tasks and activating its Deep Planning System---composed of planning, structured templating, and fine-grained debugging agents---for complex challenges. Extensive experiments across multiple benchmarks, including the highly challenging LiveCodeBench, demonstrate that MACS-Coder achieves new SOTA pass@1 results. Using the gpt-oss-20B model, it attains accuracies of 99.4% on HumanEval, 93.2% on MBPP, and 83.2% on LiveCodeBench V5, consistently outperforming prior methods such as CodeSIM and MapCoder in both accuracy and computational efficiency. When scaled to a larger open-source backbone (e.g., gpt-oss-120B), MACS-Coder also achieves SOTA performance on live-coding benchmarks. The primary contribution of our work is to bridge the performance gap between compact open-source models and elite closed-source systems: we show that an open-source gpt-oss-20B model empowered by MACS-Coder can achieve performance comparable to top-tier models such as o4-Mini (High) and Gemini 2.5 Pro.
By making SOTA-level performance more accessible and resource-efficient, MACS-Coder represents a significant step toward democratizing advanced AI-assisted programming. We will open-source the framework and evaluation code to facilitate future research. Explore our code and video demo at https://anonymous.4open.science/r/MACS-Coder-B161HIIRqq0023.
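To make the dual-process design concrete, the routing described in the abstract can be sketched in a few lines of Python. Everything below is a hypothetical illustration: the function names (`fast_thinking`, `deep_planning`, `solve`), the agent placeholders, and the word-count complexity heuristic are our assumptions, not the paper's actual interface or routing criterion.

```python
# Hypothetical sketch of MACS-Coder's dual-process control flow.
# All names and the difficulty heuristic are illustrative assumptions;
# the framework's real agents and routing logic may differ.

def fast_thinking(task: str) -> str:
    """Single low-cost generation pass (placeholder for the real agent)."""
    return f"# quick solution for: {task}"

def deep_planning(task: str) -> str:
    """Plan -> structured template -> fine-grained debug (placeholder agents)."""
    plan = f"plan({task})"
    template = f"template({plan})"
    return f"# debugged solution from {template}"

def solve(task: str, is_complex) -> str:
    # Route simple tasks to the cheap system; escalate complex ones
    # to the deliberative pipeline.
    return deep_planning(task) if is_complex(task) else fast_thinking(task)

# Toy complexity heuristic: long task descriptions count as complex.
result = solve("reverse a string", lambda t: len(t.split()) > 20)
```

The key design point is that the expensive planning/templating/debugging pipeline is only invoked when the (cheap) complexity check fires, which is what lets a small backbone stay efficient on easy problems.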
Primary Area: applications to computer vision, audio, language, and other modalities
Submission Number: 4363