Keywords: Jailbreaking, Hijacking, Agentic AI, Function Calling Models, LLMs, Security
TL;DR: This paper introduces the first function hijacking attack that manipulates the tool selection process of agentic models to force the invocation of a specific, attacker-chosen function.
Abstract: The growth of agentic AI has drawn significant attention to function calling Large Language Models (LLMs), which are designed to extend the capabilities of AI-powered systems by invoking external functions. Injection and jailbreaking attacks have been extensively explored to demonstrate the vulnerability of LLMs to user prompt manipulation. The expanded capabilities of agentic models introduce further vulnerabilities via their function calling interface. Recent work in LLM security showed that function calling can be abused to tamper with or steal data, to cause disruptive behavior such as endless loops, or to make LLMs produce harmful content in the style of jailbreaking attacks. This paper introduces the first function hijacking attack that manipulates the tool selection process of agentic models to force the invocation of a specific, attacker-chosen function. We conducted experiments on three different models, reaching attack success rates (ASR) of 80% to 98% on the established BFCL dataset. We also introduce FunSecBench, an extension of the BFCL dataset designed to assess the vulnerability of function calling models to the triggering of attacker-selected functions. Our findings further demonstrate the need for strong guardrails and security modules for agentic systems.
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 17235