Abstract: Large language models (LLMs) have fueled significant progress in intelligent multi-agent systems (MAS), with expanding academic and industrial applications. However, safeguarding these systems against malicious queries has received relatively little attention, and methods designed for single-agent safety are difficult to transfer. In this paper, we explore MAS safety from a topological perspective, aiming to identify structural properties that enhance security. To this end, we propose the NetSafe framework, which unifies diverse MAS workflows via iterative RelCom interactions to enable generalized analysis. We identify several critical phenomena for MAS under attack (misinformation, bias, and harmful content), termed $\textbf{\textit{Agent Hallucination}}$, $\textbf{\textit{Aggregation Safety}}$, and $\textbf{\textit{Security Bottleneck}}$. Furthermore, we verify that highly connected and larger systems are more vulnerable to adversarial spread, with task performance in a Star Graph Topology decreasing by 29.7%. In conclusion, our work introduces a new perspective on MAS safety and uncovers previously unreported phenomena, offering insights and posing challenges to the community.
Paper Type: Long
Research Area: Ethics, Bias, and Fairness
Research Area Keywords: model bias evaluation, ethical considerations in NLP applications, reflections and critiques
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Reproduction study, Data resources, Data analysis
Languages Studied: English
Submission Number: 1962