Diversity for The Win: Towards Building Multi-Agent Systems with Heterogeneous LLMs

Rui Ye; Xiangrui Liu; Xianghe Pang; Qimin Wu; Zhenfei Yin; LEI BAI; Siheng Chen

19 Sept 2025 (modified: 09 Jan 2026), ICLR 2026 Conference Withdrawn Submission, CC BY 4.0

Keywords: LLM, Multi-Agent Systems

Abstract: LLM-based multi-agent systems (MAS) extend the capabilities of single LLMs by enabling cooperation among multiple specialized agents. However, most existing MAS frameworks rely on a single LLM to drive all agents, constraining the system's intelligence to the limitations of that model. This paper explores the paradigm of heterogeneous LLM-driven MAS, aiming to elevate the system's potential to the collective intelligence of diverse LLMs. We introduce X-MAS-Bench, a comprehensive testbed designed to evaluate the performance of various LLMs across different domains and MAS-related functions. Through an extensive empirical study, we assess 28 LLMs across 5 domains (encompassing 21 test sets) and 5 functions, conducting over 1.7 million evaluations to identify optimal model selections for each domain-function combination. Building on these findings, we demonstrate how transitioning from homogeneous to heterogeneous LLM-driven MAS can significantly enhance system performance without requiring structural redesign. Specifically, in a chatbot-only MAS scenario, the heterogeneous configuration yields up to 6.4% performance improvement for MAS methods on the MATH dataset. In a mixed chatbot-reasoner scenario, the heterogeneous MAS achieves up to 47% performance boost on the AIME dataset. Our results underscore the transformative potential of heterogeneous LLMs in MAS, highlighting a promising direction for future research in scalable, collaborative AI systems.
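The per-combination model selection described in the abstract can be illustrated with a minimal sketch. It assumes benchmark results are available as a mapping from (model, domain, function) to a score; the names `select_best_models`, `model_a`, `domain_1`, `function_1` and all numeric values are hypothetical placeholders, not results or identifiers from the paper.

```python
def select_best_models(scores):
    """Pick the top-scoring model for each (domain, function) pair.

    `scores` maps (model, domain, function) -> score (higher is better).
    Returns {(domain, function): (model, score)}. On ties, the model
    seen first in iteration order is kept.
    """
    best = {}
    for (model, domain, function), score in scores.items():
        key = (domain, function)
        if key not in best or score > best[key][1]:
            best[key] = (model, score)
    return best


# Placeholder scores for illustration only (not taken from the paper).
scores = {
    ("model_a", "domain_1", "function_1"): 0.62,
    ("model_b", "domain_1", "function_1"): 0.71,
    ("model_a", "domain_1", "function_2"): 0.80,
    ("model_b", "domain_1", "function_2"): 0.55,
}

assignment = select_best_models(scores)
for (domain, function), (model, score) in sorted(assignment.items()):
    print(f"{domain}/{function}: {model} ({score:.2f})")
```

Under this assumption, a heterogeneous MAS would assign each agent the model chosen for its domain and function, while a homogeneous MAS would use a single model for every agent.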

Primary Area: datasets and benchmarks

Submission Number: 18118
