# MCPSecBench

It includes ![MCPSecBench](mcpbench) and ![data](data) in our experiment.

## Overview of MCPSecBench

- main.py: an automated testing script including part attacks.
- addserver.py: normal server for computation.
- maliciousadd.py: malicious server.
- download.py: a normal server for checking signature.
- squatting.py: a malicious server for server name squatting.
- client.py: client that connect with MCP host and server. At present, it support OpenAI and Claude. It can be extended for Deepseek, Llama, and QWen.
- mitm.py: the script that implements Man-in-the-Middle attack.
- index.js: the script for DNS rebinding attack.
- cve-2025-6541.py: a malicious server to trigger CVE-2025-6541.
- claude_desktop_config.json: the configuration for Claude Desktop.
- prompts: example prompts for testing.
- results: only for openai at present.

## Set up MCPSecBench

*needs: python version higher than 3.10*

- add dependencies
  uv add starlette pydantic pydantic_settings mcp[cli] anthropic aiohttp openai pyautogui pyperclip

  **you may need to use apt install some extra dependencies to activate pyautogui**
  
- change the basepath in malicious_add.py to you real path

- for tool name squatting and server name squatting in Claude. Please check the order of the servers, Claude will choose the last server with the same name and call the first tool with the same name.

## How to use MCPSecBench

### Test Script
The auto check only supports OpenAI at present.

- set API_Key. export OPENAI_API_KEY xxxx / export ANTHROPIC_API_KEY xxx

- uv run main.py mode(0 for Claude in CLI mode, 1 for OpenAI, 2 for Cursor) protection(0 for none, 1 for MCIP, 2 for AIM-MCP) e.g. uv run main.py 1 2

**Delete /tmp/state.json at first.**
**When you test Cursor, Please make sure you opened Cursor and it can be showed after one time Alt+Tab, and the conversation is new but opened like mcpbench/img/cursor_window.png**
### Testing LLM models and MCP servers with own MCP client

- First launch all remote servers. For example: uv run download.py
- set API_Key. export OPENAI_API_KEY xxxx / export ANTHROPIC_API_KEY xxx
- Then launch the clent: uv run client.py mode(0, 1). 0 for claude, 1 for openai. 
- In the end, interactive with LLM model 

### Testing Claude-Desktop

- First copy the content of claude_desktop_config.json to your claude_desktop_config.json, change the directory to your path.
- Launch all remote servers. For example: uv run download.py
- Test by Claude-Desktop

### Testing Cursor

- Copy the content of cursor_config.json to Cursor configuration, change the directory to your path.
- Launch all remote servers. For example: uv run download.py
- Test by Cursor manually or via main.py

## Experiment Results

Experiments Results are shown in ![data](data) folder.

<!--
- Tool Poison Attacks

  Claude:

  ![toolp-claude](img/toolp-claude.png)


  OpenAI: 

  ![toolp-openai](img/toolp-openai.png)
  
  Cursor:

  ![toolp-cursor](img/toolp-cursor.png)


- Tool Shadowing Attacks

  Claude:

  ![tools-claude](img/tools-claude.png)


  OpenAI:

  ![tools-openai](img/tools-openai.png)

  Cursor:

  ![tools-cursor](img/tools-cursor.png)
  
  ![tools-cursor2](img/tools-cursor2.png)

- Data Exfiltration

  Claude:

  ![data-claude](img/data-claude.png)

  OpenAI：

  ![data-openai](img/data-openai.png)
  ![data-openai2](img/data-openai2.png)


  Cursor:

  ![data-cursor](img/data-cursor.png)


- Prompt Injection

  Claude(failed):

  ![prompt-claude](img/prompt-claude.png)

  OpenAI:

  ![prompt-openai](img/prompt-openai.png)
  
  Cursor:

  ![prompt-cursor](img/prompt-cursor.png)


- Slash Command Overlap

  Cursor:

  ![slash](img/slash.png)

- Rug Pull

  Claude:

  ![rug-claude](img/rug-claude.png)
  
  OpenAI:
  
  ![rug-openai](img/rug-openai.png)

  Cursor:

  ![rug-cursor](img/rug-cursor.png)

  
- Indirect Prompt Injection

  Claude:

  ![indirect-claude](img/indirect-claude.png)

  OpenAI:

  ![indirect-openai](img/indirect-openai.png)
  
  Cursor:

  ![indirect-cursor](img/indirect-cursor.png)

  - Privilege Escalation (indirect prompt injection)


- Package Name Squatting(server name)

  Claude:

  ![server-claude](img/server-claude.png)


  OpenAI:
  
  ![server-openai](img/server-openai.png)

  Cursor:

  ![server-cursor](img/server-cursor.png)
  ![server-cursor2](img/server-cursor2.png)

- Package Name Squatting(tool name)

  Claude:

  ![tool-claude](img/tool-claude.png)

  OpenAI:

  ![tool-openai](img/tool-openai.png)
  
  Cursor:

  ![tool-cursor](img/tool-cursor.png)
  ![tool-cursor2](img/tool-cursor2.png)
  
- Sandbox Escape

  Claude:

  ![sand-claude](img/sand-claude.png)

  OpenAI:

  ![sand-openai](img/sand-openai.png)

  Cursor:

  ![sand-cursor](img/sand-cursor.png)

- Tool/Service Misuse via “Confused AI”

  Claude:

  ![misuse-claude](img/misuse-claude.png)


  OpenAI:

  ![misuse-openai](img/misuse-openai.png)

  Cursor:

  ![misuse-cursor](img/misuse-cursor.png)


- MITM

  Claude:

  ![mitm-claude](img/mitm-claude.png)

  OpenAI:

  ![mitm-openai](img/mitm-openai.png)
  
  Cursor:

  ![mitm-cursor](img/mitm-cursor.png)
  
- DNS rebinding

  Claude:

  ![dns-claude](img/dns-claude.png)

  OpenAI:

  ![dns-openai](img/dns-openai.png)
  
  Cursor(be aware that no proxy is set):

  ![dns-cursor](img/dns-cursor.png)

- Vulnerable server

  Claude:

  ![vulnerable-claude](img/vulnerable-claude.png)

  OpenAI:

  ![vulnerable-openai](img/vulnerable-openai.png)
  
  Cursor:

  ![vulnerable-cursor](img/vulnerable-cursor.png)

- Vulnerable client(works on Windows)

  Claude:

  ![client-claude](img/client-claude.png)

  OpenAI:

  ![client-openai](img/client-openai.png)

  Cursor:
  
  ![client-cursor](img/client-cursor.png)

- Configuration Drift

  Claude:

  ![conf-claude](img/conf-claude.png)
  
  OpenAI:

  ![conf-openai](img/conf-openai.png)

  Cursor:

  ![conf-cursor](img/conf-cursor.png)


- Schema inconsistencies

  Claude:

  ![schema-claude](img/schema-claude.png)

  OpenAI:

  ![schema-openai](img/schema-openai.png)
  
  Cursor:

  ![schema-cursor](img/schema-cursor.png)

  
-->
