# TooBad

Code Repo for the paper "TooBad: Backdoor Diffusion Models with Ultra-Low Poison Rate and Imperceptible Trigger"

## Setup Environment

- Python: 3.8.5
- PyTorch: 2.4.1+cu121

Please create a virtual environment as follows:

```bash
conda create -n TooBad python=3.8.5
conda activate TooBad
conda install pytorch=2.4.1 pytorch-cuda=12.1 -c pytorch -c nvidia
```

## Install Required Packages

Please run the following command to install packages:

```bash
bash install.sh
```

## Backdoor Diffusion Models with TooBad

### Trigger Optimization

To generate a trigger for a DDPM pretrained on CIFAR-10 with `HAT` as the backdoor target, run:

```bash
python trigger_optimization_DDPM.py --target HAT --ckpt google/ddpm-cifar10-32 --num_epoch 50 --batch_size 32 --learning_rate 0.3
```

To generate triggers for other models and backdoor targets, change the arguments:

- `--target`: Name of the backdoor target, such as `SHOE`.
- `--ckpt`: The pretrained model. If omitted, the default model is downloaded automatically from Hugging Face.
- `--num_epoch`: Total number of training epochs. Adjust it to trade running time for performance.
- `--batch_size`: Training batch size.
- `--learning_rate`: Learning rate for trigger optimization.
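As a convenience, the command can also be assembled from Python when generating triggers for several targets. The following is a minimal sketch, not part of the repository: the helper name `trigger_command` is hypothetical, and its default values are simply copied from the example command above, so the script's real defaults may differ.

```python
import shlex
from typing import List, Optional


def trigger_command(
    target: str,
    ckpt: Optional[str] = None,
    num_epoch: int = 50,
    batch_size: int = 32,
    learning_rate: float = 0.3,
) -> List[str]:
    """Build the argument list for trigger_optimization_DDPM.py (hypothetical helper)."""
    cmd = ["python", "trigger_optimization_DDPM.py", "--target", target]
    # --ckpt is optional: when it is left out, the script picks its default model.
    if ckpt is not None:
        cmd += ["--ckpt", ckpt]
    cmd += [
        "--num_epoch", str(num_epoch),
        "--batch_size", str(batch_size),
        "--learning_rate", str(learning_rate),
    ]
    return cmd


# Print the command for a different target and a longer run.
print(shlex.join(trigger_command("SHOE", num_epoch=100)))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` to launch the run.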

### Backdoor Injection

After generating the trigger, inject it into the diffusion model.

For example, to backdoor a DDPM trained on CIFAR-10 with target "HAT" and a 5% poison rate, run:

```bash
python backdoor_injection.py --project default --mode train --dataset CIFAR10 --batch 128 --epoch 50 --poison_rate 0.05 --trigger TooBad_DDPM_CIFAR_10_HAT --target HAT --ckpt DDPM-CIFAR10-32 --fclip o -o --gpu 0
```
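To get a feel for what a poison rate means in absolute terms, the short sketch below converts a rate into a sample count. It assumes the rate is taken as a fraction of the standard 50,000-image CIFAR-10 training split; how `backdoor_injection.py` actually samples or rounds is not described here, and the helper `num_poisoned` is hypothetical.

```python
CIFAR10_TRAIN_SIZE = 50_000  # size of the standard CIFAR-10 training split


def num_poisoned(poison_rate: float, dataset_size: int = CIFAR10_TRAIN_SIZE) -> int:
    """Approximate number of poisoned samples (assumes rate * dataset size, rounded)."""
    if not 0.0 <= poison_rate <= 1.0:
        raise ValueError("poison_rate must be between 0 and 1")
    return round(poison_rate * dataset_size)


for rate in (0.002, 0.05, 0.4):
    print(f"poison_rate={rate}: about {num_poisoned(rate)} poisoned images")
```

Under this assumption, `--poison_rate 0.05` corresponds to roughly 2,500 poisoned images.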

To backdoor an NCSN trained on CIFAR-10 with target "CAT" and a 40% poison rate, run:

```bash
python backdoor_injection.py --postfix flex_new-set --project default --mode train --learning_rate 2e-05 --dataset CIFAR10 --sde_type SDE-VE --batch 128 --epoch 40 --clean_rate 1.0 --poison_rate 0.4 --dataset_load_mode FIXED --trigger TooBad_NCSN_CIFAR_10_CAT --target CAT --solver_type sde --psi 0 --vp_scale 1.0 --ve_scale 1.0 --ckpt FrankCCCCC/NCSN_CIFAR10_my --fclip o --save_image_epochs 2 --save_model_epochs 5 -o --R_trigger_only --gpu 0
```

## Evaluation

We provide `automatic_evaluation.py` for automatic evaluation. It takes the following arguments:

- `--eval_mode`: Evaluation mode.
  - `across_poison_rates`: Evaluate multiple models within a designated poison rate range.
  - `specific_poison_rate`: Evaluate a single model at a specific poison rate, analyzing its performance over epochs.
- `--poison_rate_min`: Minimum poison rate. Used only in `across_poison_rates` mode.
- `--poison_rate_max`: Maximum poison rate. Used only in `across_poison_rates` mode.
- `--poison_rate_specific`: Poison rate of the model to analyze. Used only in `specific_poison_rate` mode.
- `--target`: Backdoor target of the model to evaluate.
- `--model_type`: Type of the model to evaluate, either `DDPM` or `NCSN`.
- `--dataset_name`: Name of the dataset, such as `CIFAR_10`.
- `--trigger`: Trigger the model was backdoored with. Choose `TooBad` to evaluate our generated trigger; otherwise, give the name of the trigger, such as `STOP_SIGN_14`.
- `--recompute`: Set to `True` to run a fresh evaluation on every run.
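The mode-specific arguments above can be checked before launching a long evaluation. The sketch below is illustrative only: `check_eval_args` is a hypothetical helper, and the strict rejection of arguments that belong to the other mode is an assumption, since the script may simply ignore them.

```python
from typing import Optional


def check_eval_args(
    eval_mode: str,
    poison_rate_min: Optional[float] = None,
    poison_rate_max: Optional[float] = None,
    poison_rate_specific: Optional[float] = None,
) -> None:
    """Raise ValueError if the arguments do not fit the chosen evaluation mode."""
    if eval_mode == "across_poison_rates":
        if poison_rate_min is None or poison_rate_max is None:
            raise ValueError("across_poison_rates needs --poison_rate_min and --poison_rate_max")
        if poison_rate_min > poison_rate_max:
            raise ValueError("--poison_rate_min must not exceed --poison_rate_max")
        if poison_rate_specific is not None:
            raise ValueError("--poison_rate_specific belongs to specific_poison_rate mode")
    elif eval_mode == "specific_poison_rate":
        if poison_rate_specific is None:
            raise ValueError("specific_poison_rate needs --poison_rate_specific")
        if poison_rate_min is not None or poison_rate_max is not None:
            raise ValueError("--poison_rate_min/max belong to across_poison_rates mode")
    else:
        raise ValueError(f"unknown eval_mode: {eval_mode!r}")


# Both calls pass silently because the arguments match their mode.
check_eval_args("across_poison_rates", poison_rate_min=0.002, poison_rate_max=0.005)
check_eval_args("specific_poison_rate", poison_rate_specific=0.05)
```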

### Evaluate multiple models

For example, to evaluate multiple DDPMs backdoored by TooBad with target "HAT" and poison rates in the range 0.2%-0.5%, run:

```bash
python automatic_evaluation.py --eval_mode across_poison_rates --poison_rate_min 0.002 --poison_rate_max 0.005 --target HAT --model_type DDPM --dataset_name CIFAR_10 --trigger TooBad
```

This reports the ASR, MSE, and SSIM of these models in a dataframe similar to the following:

```text
         trigger          target  poison_rate  best_epoch   ASR       MSE      SSIM
0       TooBad_DDPM_HAT    HAT        0.001         1      0.02    0.097906  0.097846
1       TooBad_DDPM_HAT    HAT        0.002        91      0.32    0.077766  0.379909
2       TooBad_DDPM_HAT    HAT        0.003        97      0.57    0.049155  0.605643
3       TooBad_DDPM_HAT    HAT        0.004        91      0.59    0.048803  0.614588
4       TooBad_DDPM_HAT    HAT        0.005        33      0.76    0.027558  0.753949
...
```
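One way to read such a table is to ask for the lowest poison rate that reaches a desired ASR. The sketch below does this on the `(poison_rate, ASR)` pairs copied from the sample output above; the helper `lowest_rate_reaching` and the thresholds are hypothetical and not part of `automatic_evaluation.py`.

```python
from typing import List, Optional, Tuple

# (poison_rate, ASR) pairs copied from the sample output above.
RESULTS: List[Tuple[float, float]] = [
    (0.001, 0.02),
    (0.002, 0.32),
    (0.003, 0.57),
    (0.004, 0.59),
    (0.005, 0.76),
]


def lowest_rate_reaching(results: List[Tuple[float, float]], min_asr: float) -> Optional[float]:
    """Return the smallest poison rate whose ASR is at least min_asr, or None."""
    hits = [rate for rate, asr in results if asr >= min_asr]
    return min(hits) if hits else None


print(lowest_rate_reaching(RESULTS, 0.5))  # 0.003
print(lowest_rate_reaching(RESULTS, 0.9))  # None: no listed rate reaches 0.9
```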

### Evaluate a single model through different epochs

To track attack performance across the training epochs of a single model backdoored at a specific poison rate, run:

```bash
python automatic_evaluation.py --eval_mode specific_poison_rate --poison_rate_specific 0.05 --target HAT --model_type DDPM --dataset_name CIFAR_10 --trigger TooBad
```

This produces output similar to:

```text
    epoch   ASR       MSE      SSIM
0     0    0.00    0.092717  0.108453
1     1    0.07    0.067950  0.388008
2     3    0.97    0.009132  0.896584
3     5    0.97    0.006304  0.935276
4     7    0.91    0.009368  0.896942
5     9    0.90    0.012194  0.878657
6    11    0.95    0.006854  0.930661
7    13    0.99    0.002570  0.968714
...
```
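The per-epoch output can be reduced to a single best epoch. How the `best_epoch` column of the multi-model report is chosen is not stated here, so the sketch below uses an assumed rule: highest ASR, with ties broken by lowest MSE. The rows are copied from the sample output above, and `best_epoch` is a hypothetical helper.

```python
from typing import List, Tuple

# (epoch, ASR, MSE) rows copied from the sample output above.
ROWS: List[Tuple[int, float, float]] = [
    (0, 0.00, 0.092717),
    (1, 0.07, 0.067950),
    (3, 0.97, 0.009132),
    (5, 0.97, 0.006304),
    (7, 0.91, 0.009368),
    (9, 0.90, 0.012194),
    (11, 0.95, 0.006854),
    (13, 0.99, 0.002570),
]


def best_epoch(rows: List[Tuple[int, float, float]]) -> int:
    """Pick the epoch with the highest ASR; break ties by the lowest MSE (assumed rule)."""
    return max(rows, key=lambda row: (row[1], -row[2]))[0]


print(best_epoch(ROWS))      # 13
print(best_epoch(ROWS[:4]))  # 5: epochs 3 and 5 tie on ASR, epoch 5 has the lower MSE
```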

### Reproduce the results presented in the paper

To facilitate reproduction of the paper's results, we provide a script that runs all models (about 40) at once; it may take about 2 days to complete:

```bash
bash run_automatic_training.sh
```

## Acknowledgement

The backdoor injection stage of this code is based on VillanDiffusion; please refer to their implementation: https://github.com/IBM/VillanDiffusion/tree/main

## Reference
