OMNI-LEAK: Orchestrator Multi-Agent Network Induced Data Leakage

16 Sept 2025 (modified: 12 Feb 2026)ICLR 2026 Conference Desk Rejected SubmissionEveryoneRevisionsBibTeXCC BY 4.0
Keywords: Multi-Agent Systems, Adversarial Robustness, Indirect Prompt Injection, Model Evaluation, LLM Agents, LLM Security
TL;DR: Introduces a novel attack vector compromising orchestrator multi-agent system safety, demonstrating single-agent safety does not generalize to multi-agent settings
Abstract: As Large Language Model (LLM) agents become more capable, their coordinated use in the form of multi-agent systems is on the rise. Prior work has examined the safety and misuse risks associated with agents. However, much of this has focused on the single-agent case and/or setups that lack basic engineering safeguards such as access control, revealing a scarcity of threat modeling in multi-agent systems. We investigate the security vulnerabilities of a popular industry multi-agent pattern known as the orchestrator setup, in which a central agent decomposes and delegates tasks to specialized agents. Through red-teaming a concrete setup representative of industry use, we demonstrate a novel attack vector, OMNI-LEAK, that compromises several agents to leak sensitive data through a single indirect prompt injection, even in the *presence of data access control*. We report the susceptibility of frontier models to different categories of attacks, finding that both reasoning and non-reasoning models are vulnerable, even when the attacker lacks insider knowledge of the implementation details. Our work highlights the failure of safety research to generalize from single-agent to multi-agent settings, indicating the serious risks of real-world privacy breaches and financial loss.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 7884
Loading