Abstract: Multi-Agent Systems (MAS) built from Large Language Models (LLMs) offer significant potential for complex problem-solving, yet configuring them well is challenging, with performance typically evaluable only after resource-intensive execution. Addressing the underexplored area of MAS performance predictability, this paper investigates whether and how accurately MAS outcomes can be forecast. We propose and evaluate a methodology that monitors MAS operations during execution, captures agent inputs and outputs, and transforms this data into system-specific statistical indicators. These indicators are then used to train a regression model to predict overall task performance. Conducting experiments across five distinct MAS architectures and three benchmark tasks, we demonstrate that MAS performance is predictable to a substantial degree, with Spearman rank correlations between predicted and actual scores typically ranging from $\textbf{0.76}$ to $\textbf{0.94}$. Notably, our findings indicate that the global statistics required for these predictions can be accurately estimated from as little as 10\% of the total operational data-generating events, still yielding a high correlation of $\textbf{0.82}$. Further analysis reveals that metrics quantifying individual agent capabilities are the most influential factors in performance prediction. This work underscores the feasibility of reliably predicting MAS performance, offering a path towards more efficient design, configuration, and deployment of MASs.
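To make the abstract's pipeline concrete, the sketch below illustrates the three stages it describes: per-run statistical indicators derived from logged agent inputs and outputs, a regression model fit to predict task performance, and evaluation via Spearman rank correlation. This is a minimal illustration, not the authors' code: the synthetic data, the feature layout (one row of system-level statistics per MAS run), and the choice of gradient-boosted regression are all assumptions, as the paper specifies only that "a regression model" is trained on such indicators.

```python
# Minimal sketch of the predict-MAS-performance pipeline described in the
# abstract. All data here is synthetic and the model choice is an assumption.
import numpy as np
from scipy.stats import spearmanr
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Assumed log-derived indicators: one row per MAS run, e.g. mean agent
# output length, inter-agent agreement rate, message count, etc.
n_runs, n_features = 200, 8
X = rng.normal(size=(n_runs, n_features))  # stand-in statistical indicators
# Stand-in task scores with a noisy linear dependence on the indicators.
y = X @ rng.normal(size=n_features) + rng.normal(scale=0.5, size=n_runs)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Fit a regression model on indicators from completed runs.
model = GradientBoostingRegressor(random_state=0)
model.fit(X_tr, y_tr)

# Evaluate as in the paper: rank correlation between predicted and actual
# scores on held-out runs.
rho, _ = spearmanr(model.predict(X_te), y_te)
print(f"Spearman rank correlation: {rho:.2f}")
```

The paper's subsampling result would correspond to computing the indicator rows from only a fraction (e.g. 10%) of each run's logged events before fitting; the evaluation step is unchanged.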
Paper Type: Long
Research Area: Language Modeling
Research Area Keywords: Language Modeling, Generation, Machine Learning for NLP
Contribution Types: NLP engineering experiment, Data analysis
Languages Studied: English
Submission Number: 5542