# Resource Consumption Red-Teaming for Large Vision-Language Models

## Overview

![](./main.png)

RECITE is a research project that implements resource consumption attacks on multimodal large language models (MLLMs) through adversarial image perturbations. The project demonstrates how carefully crafted adversarial images can cause MLLMs to generate repetitive outputs, leading to increased computational resource consumption and potential denial-of-service scenarios.

## Background

Integrating a vision modality gives large vision-language models (LVLMs) additional attack vectors, which exacerbates the risk of resource consumption attacks (RCAs).
However, existing red-teaming studies have largely overlooked visual inputs as an attack surface, so mitigation strategies against RCAs in LVLMs remain insufficient. To address this gap, we propose RECITE (**Re**source **C**onsumpt**i**on Red-**Te**aming for LVLMs), the first approach that exploits the visual modality to trigger unbounded generation for RCA red-teaming.
First, we present *Vision Guided Optimization*, a fine-grained pixel-level optimization that produces *Output Recall* adversarial perturbations, which induce repeating output. We then inject these perturbations into visual inputs, triggering unbounded generation and achieving the goal of RCAs.
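To make "repeating output" concrete, the following is a minimal sketch of how repetition in a generated response could be detected. It is not the project's evaluation code: the helper names `max_consecutive_repeats` and `is_repetitive`, the whitespace tokenization, and the `max_span` limit are all assumptions made for illustration.

```python
from typing import Sequence


def max_consecutive_repeats(tokens: Sequence[str], span: int = 1) -> int:
    """Return the largest number of back-to-back copies of any `span`-token chunk."""
    if span < 1:
        raise ValueError("span must be >= 1")
    best = 1 if len(tokens) >= span else 0
    for start in range(len(tokens) - span + 1):
        chunk = list(tokens[start:start + span])
        count = 1
        pos = start + span
        # Keep stepping forward while the next chunk is an exact copy.
        while list(tokens[pos:pos + span]) == chunk:
            count += 1
            pos += span
        best = max(best, count)
    return best


def is_repetitive(text: str, repeat_num: int = 3, max_span: int = 8) -> bool:
    """Hypothetical criterion: some chunk of up to `max_span` whitespace-separated
    tokens occurs at least `repeat_num` times in a row."""
    tokens = text.split()
    return any(
        max_consecutive_repeats(tokens, span) >= repeat_num
        for span in range(1, max_span + 1)
    )
```

With `span=1` this corresponds loosely to token-level repetition, and larger spans to sentence-level repetition. A real evaluation would more likely operate on the model tokenizer's token IDs than on whitespace-split words.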

### Key Features

- **Universal and Independent Attack Modes**: Supports both universal (transferable) and independent (model-specific) adversarial perturbations
- **Multiple Model Support**: Compatible with various MLLM architectures including:
  - Qwen2.5-VL series (3B, 7B, 32B)
  - LLaVA series (7B, 13B)
  - InstructBLIP series (7B, 13B)
- **PGD-based Perturbation Optimization**: Uses Projected Gradient Descent (PGD) to generate adversarial perturbations
- **Comprehensive Evaluation**: Includes attack success rate measurement and perturbation analysis
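The PGD-based optimization mentioned above can be illustrated with a small, dependency-free sketch of L-infinity PGD on a flat list of pixel values. This is a toy under stated assumptions, not the repository's trainer: `pgd_linf`, `grad_fn`, `epsilon`, and `step_size` are hypothetical names and defaults, and `grad_fn` stands in for the gradient of whatever repetition-inducing loss the attack minimises (in practice obtained by backpropagating through the model).

```python
from typing import Callable, List

Vector = List[float]


def sign(x: float) -> int:
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def pgd_linf(
    image: Vector,
    grad_fn: Callable[[Vector], Vector],
    epsilon: float = 8 / 255,   # assumed perturbation budget
    step_size: float = 1 / 255,  # assumed per-step size
    steps: int = 100,
) -> Vector:
    """Minimise a loss with L-infinity PGD and return the adversarial image."""
    delta = [0.0] * len(image)
    for _ in range(steps):
        adv = [p + d for p, d in zip(image, delta)]
        grad = grad_fn(adv)
        # Signed gradient descent step on the perturbation.
        delta = [d - step_size * sign(g) for d, g in zip(delta, grad)]
        # Project back into the epsilon-ball around the clean image.
        delta = [max(-epsilon, min(epsilon, d)) for d in delta]
        # Keep the perturbed pixels inside the valid [0, 1] range.
        delta = [min(1.0, max(0.0, p + d)) - p for p, d in zip(image, delta)]
    return [p + d for p, d in zip(image, delta)]
```

For example, with a quadratic loss pulling pixels toward a target, `pgd_linf([0.5, 0.5], lambda x: [2 * (a - t) for a, t in zip(x, [1.0, 0.0])], epsilon=0.1, step_size=0.01)` moves each pixel by at most 0.1 toward its target and then stays on the boundary of the budget.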

## Installation

### Prerequisites

- Python 3.8+
- CUDA-compatible GPU (recommended)
- PyTorch 2.0+
- Transformers library

### Setup

1. **Clone the repository**:
```bash
git clone <repository-url>
cd RECITE
```

2. **Install dependencies**:
```bash
pip install -r requirements.txt
```

## Project Structure

```
RECITE_code/
├── independent/                  # Independent attack 
│   ├── multimodal_image_main.py      # Main execution script
│   ├── multimodal_image_trainer.py   # Training pipeline
│   ├── multimodal_image_generator.py # Adversarial generation
│   ├── log/                          # Training logs
│   ├── pixel_values/                 # Generated perturbations
│   └── adversarial_images/           # Output adversarial images
├── universal/                    # Universal attack 
│   ├── multimodal_image_main.py      # Main execution script
│   ├── multimodal_image_trainer.py   # Training pipeline
│   ├── multimodal_image_generator.py # Adversarial generation
│   ├── log/                          # Training logs
│   ├── pixel_values/                 # Generated perturbations
│   └── adversarial_images/           # Output adversarial images
└── data/                         # Attack datasets
    ├── diversity_image_responses_with_repeats_qwen.csv
    ├── diversity_image_responses_with_repeats_llava.csv
    ├── diversity_image_responses_with_repeats_blip.csv
    └── image/                        # Sample images
```

## Usage

### Basic Usage

1. **Independent Attack** (image-specific):
```bash
cd independent
python multimodal_image_main.py \
    --model_type qwen2vl3b \
    --attack_type token \
    --repeat_num 3 \
    --steps 1000 \
    --batch_size 1
```

2. **Universal Attack** (universal across images):
```bash
cd universal
python multimodal_image_main.py \
    --model_type qwen2vl3b \
    --attack_type sentence \
    --repeat_num 5 \
    --steps 1000 \
    --batch_size 5
```



### Parameters

- `--model_type`: Target model type (`qwen2vl3b`, `qwen2vl7b`, `qwen2vl32b`, `llava7b`, `llava13b`, `insblip7b`, `insblip13b`)
- `--attack_type`: Attack target (`token` for token repetition, `sentence` for sentence repetition)
- `--repeat_num`: Number of repetitions to target (3, 5, or 10)
- `--steps`: Number of PGD iteration steps
- `--batch_size`: Batch size for training
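As a reference for how these flags fit together, here is a minimal `argparse` sketch that mirrors the documented parameters and their allowed values. It is an illustration, not the actual parser in `multimodal_image_main.py`: the `build_parser` name, the `required` settings, and the default values are assumptions.

```python
import argparse

# Allowed values copied from the parameter list above.
MODEL_TYPES = [
    "qwen2vl3b", "qwen2vl7b", "qwen2vl32b",
    "llava7b", "llava13b",
    "insblip7b", "insblip13b",
]


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the documented flags (defaults are assumed)."""
    parser = argparse.ArgumentParser(description="RECITE attack options (sketch)")
    parser.add_argument("--model_type", choices=MODEL_TYPES, required=True,
                        help="Target model type")
    parser.add_argument("--attack_type", choices=["token", "sentence"], required=True,
                        help="Repeat a token or a whole sentence")
    parser.add_argument("--repeat_num", type=int, choices=[3, 5, 10], default=3,
                        help="Number of repetitions to target")
    parser.add_argument("--steps", type=int, default=1000,
                        help="Number of PGD iteration steps")
    parser.add_argument("--batch_size", type=int, default=1,
                        help="Batch size for training")
    return parser
```

Restricting `--model_type`, `--attack_type`, and `--repeat_num` with `choices` makes an unsupported value fail at startup rather than partway through a long optimization run.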

### Model Paths

Update the model paths in the main script according to your local setup:

```python
MODEL_PATHS = {
    'llava7b': '/path/to/llava-1.5-7b-hf',
    'llava13b': '/path/to/llava-1.5-13b-hf',
    'insblip7b': '/path/to/instructblip-vicuna-7b',
    'insblip13b': '/path/to/instructblip-vicuna-13b',
    'qwen2vl3b': '/path/to/Qwen2.5-VL-3B-Instruct',
    'qwen2vl7b': '/path/to/Qwen2.5-VL-7B-Instruct',
    'qwen2vl32b': '/path/to/Qwen2.5-VL-32B-Instruct',
}
```
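Because the placeholder paths above must be edited by hand, it can help to check them before any model loading starts. The sketch below shows one way to do that; `resolve_model_path` is a hypothetical helper, not a function from this repository, and it assumes each entry points to a local checkpoint directory.

```python
from pathlib import Path
from typing import Mapping


def resolve_model_path(model_type: str, model_paths: Mapping[str, str]) -> Path:
    """Return the checkpoint directory for `model_type`, failing early if it is unusable."""
    if model_type not in model_paths:
        known = ", ".join(sorted(model_paths))
        raise KeyError(f"Unknown model_type {model_type!r}; expected one of: {known}")
    path = Path(model_paths[model_type]).expanduser()
    if not path.is_dir():
        # Catches placeholders such as '/path/to/...' that were never updated.
        raise FileNotFoundError(
            f"Model directory for {model_type!r} not found: {path}"
        )
    return path
```

Calling `resolve_model_path(args.model_type, MODEL_PATHS)` at startup would surface a forgotten `/path/to/...` placeholder immediately instead of as a loading error later on.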

### Output Files

- `adversarial_images/`: Generated adversarial images
- `log/`: Training logs and attack statistics
- `pixel_values/`: Raw perturbation values
- Evaluation results with detailed metrics


## Ethical Considerations

⚠️ **Important**: This project is for research purposes only. Users should:

- Only test on models they own or have explicit permission to test
- Follow responsible disclosure practices
- Not use for malicious purposes
- Respect terms of service of AI platforms
