Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems

Shaokun Zhang; Ming Yin; Jieyu Zhang; Jiale Liu; Zhiguang Han; Jingyang Zhang; Beibin Li; Chi Wang; Huazheng Wang; Yiran Chen; Qingyun Wu

Published: 01 May 2025, Last Modified: 18 Jun 2025. Venue: ICML 2025 spotlight poster. License: CC BY 4.0

Abstract: Failure attribution in LLM multi-agent systems (identifying the agent and step responsible for a task failure) provides crucial clues for system debugging, but it remains underexplored and labor-intensive. In this paper, we propose and formulate a new research area: automated failure attribution for LLM multi-agent systems. To support this initiative, we introduce the Who&When dataset, comprising extensive failure logs from 127 LLM multi-agent systems with fine-grained annotations linking failures to specific agents and decisive error steps. Using Who&When, we develop and evaluate three automated failure attribution methods and summarize their respective pros and cons. The best method achieves 53.5% accuracy in identifying failure-responsible agents but only 14.2% in pinpointing failure steps, with some methods performing below random. Even SOTA reasoning models, such as OpenAI o1 and DeepSeek R1, fail to achieve practical usability. These results highlight the task's complexity and the need for further research in this area. Code and dataset are available at https://github.com/mingyin1/Agents_Failure_Attribution.

Lay Summary: When teams of AI agents work together to solve a task, it's often hard to figure out what went wrong if they fail. Specifically, it's difficult to identify which agent made the mistake and at what point in the process it happened. This is an important problem for improving how these systems work, but it's rarely studied and very time-consuming to do by hand. In our research, we introduce a new area of study focused on automatically finding the causes of failure in multi-agent AI systems. We created a dataset called Who&When, which includes detailed records of failures from 127 agentic systems, clearly showing which agent made the error and when. We also tested several methods to automatically detect these failures. Our findings show this is a tough problem that needs more research.

Link To Code: https://github.com/mingyin1/Agents_Failure_Attribution

Primary Area: Deep Learning->Large Language Models

Keywords: failure attribution, multi-agent systems.

Submission Number: 425
