Keywords: Multi-agent System (MAS), control flow hijacking, Model Context Protocol (MCP)
Abstract: Multi-Agent Systems (MAS) excel at complex problem-solving tasks by orchestrating specialized agents through a control flow. Agents are empowered by external APIs accessed via the Model Context Protocol (MCP), which standardizes the interaction between Large Language Models (LLMs) and API services through MCP servers that expose three primitives: tools, resources, and prompts. However, the widespread adoption of MCP introduces a critical vulnerability in MAS frameworks: a greedy service provider is highly motivated to deploy a malicious MCP server designed to surreptitiously inflate API usage, thereby draining a user's pre-paid account. We introduce *Phantom*, a framework that generates such malicious MCP servers. *Phantom* executes a novel attack that hijacks the MAS control flow to repeatedly activate a targeted agent and compel it to make excessive and redundant API calls. Crucially, the attack preserves the overall MAS utility in task execution and evades the exception detection mechanisms deployed by frameworks, ensuring its stealth and persistence. Extensive evaluation across three multi-agent tasks, four leading industrial frameworks, and three state-of-the-art LLMs shows that *Phantom* increases targeted API invocations by up to **26×** while maintaining an average attack success rate of **98%**. Furthermore, it demonstrates remarkable resilience, defeating six distinct mitigations with a **94%** average success rate. This work uncovers a severe, real-world threat to the MAS ecosystem and highlights the urgent need for new security paradigms.
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 4728