# Prediction Inconsistency Helps Achieve Generalizable Detection of Adversarial Examples

A PyTorch implementation of the Prediction Inconsistency Detector (PID).

## Environment of PID

```bash
conda create -n myenv python=3.8
conda activate myenv
pip install -r requirements.txt
```

## Data and Pre-trained Models

### CIFAR-10
- Place datasets in `./datasets/CIFAR10/`
- **Training Scripts**:
  - Natural training: `classifier.py`
  - Adversarial training: `classifier_pgd.py`

    Example: adversarially train the VGG19 model on CIFAR-10:
    ```bash
    python classifier_pgd.py --exp_name vgg19_adv \
                            --modelname vgg19_adv \
                            --datadir datasets \
                            --savedir results/CIFAR10/saved_model \
                            --dataname CIFAR10 \
                            --epochs 200 \
                            --lr 0.005 \
                            --batch_size 128
    ```

### ImageNet
- Store datasets in model-specific subdirectories under `./datasets/ImageNet/` (e.g., `./datasets/ImageNet/resnet50` for images that can be correctly classified by ResNet50).
- Download pre-trained models to `./results/ImageNet/saved_model/`

Details of the models used for each dataset can be found in the Appendix.

## Generate Adversarial Examples

Supported attacks:
- White-box: PGD, C&W, DeepFool
- Black-box: TA, Square, VNI-FGSM
- Mixed: AutoAttack (AA)

Example: generate PGD adversarial examples against a naturally trained VGG19 model on CIFAR-10:

```bash
python adv_samples.py --exp_name PGD \
                    --modelname vgg19 \
                    --adv_name PGD \
                    --adv_method PGD \
                    --adv_config configs_adv \
                    --datadir datasets \
                    --savedir results/CIFAR10/saved_adv_samples \
                    --dataname CIFAR10
```

Attack configurations are stored in the `./adv_configs/` directory and can be modified as needed.
Note that the TA parameters are directly specified in `adv_samples.py`.
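
When several attacks or models need to be generated in one go, it can help to build the command line programmatically. The sketch below is a hypothetical helper (`build_adv_command` is not part of this repository) that reproduces the PGD example above as an argument list. It assumes that the same attack name is passed to `--exp_name`, `--adv_name`, and `--adv_method`, as in that example; names for other attacks must match whatever `adv_samples.py` actually accepts.

```python
import shlex


def build_adv_command(attack, modelname="vgg19", dataname="CIFAR10"):
    """Return the argv list for one adv_samples.py run.

    Hypothetical helper: the flags mirror the README example, and the
    attack name is reused for --exp_name, --adv_name and --adv_method
    (an assumption taken from that example).
    """
    return [
        "python", "adv_samples.py",
        "--exp_name", attack,
        "--modelname", modelname,
        "--adv_name", attack,
        "--adv_method", attack,
        "--adv_config", "configs_adv",
        "--datadir", "datasets",
        "--savedir", f"results/{dataname}/saved_adv_samples",
        "--dataname", dataname,
    ]


cmd = build_adv_command("PGD")
# Print the command as a single shell-quoted string for inspection.
print(shlex.join(cmd))
```

To execute the command rather than print it, the list can be passed to `subprocess.run(cmd, check=True)` from the repository root.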

## Detect Adversarial Examples

In this work, we use three types of auxiliary models in PID:
- Adversarially trained CNN
- ViT
- CLIP

You can invoke the corresponding detection scripts:
- `detection_cnn.py`
- `detection_vit.py`
- `detection_clip.py`

Example: use ViT-L/16 as the auxiliary model to detect PGD adversarial examples on ImageNet:

```bash
python detection_vit.py --exp_name PGD \
                    --modelname resnet50 \
                    --savedir results/ImageNet/saved_adv_samples \
                    --dataname ImageNet \
                    --num_classes 1000 \
                    --batch_size 16
```
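
For intuition, the general idea of comparing the predictions of a primary model and an auxiliary model can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: the function names and the example scores are hypothetical, and the rule shown (flag an input when the top-1 labels disagree) is not necessarily the decision rule implemented in the detection scripts.

```python
def argmax(scores):
    """Index of the largest score in a list."""
    return max(range(len(scores)), key=scores.__getitem__)


def flag_inconsistent(primary_scores, auxiliary_scores):
    """Return one boolean per input: True if the two models' top-1 labels differ.

    Illustrative rule only; the repository's detectors may use a different
    criterion.
    """
    return [
        argmax(p) != argmax(a)
        for p, a in zip(primary_scores, auxiliary_scores)
    ]


# Made-up class scores for two inputs and three classes.
primary = [[0.1, 0.8, 0.1], [0.7, 0.2, 0.1]]
auxiliary = [[0.2, 0.7, 0.1], [0.1, 0.3, 0.6]]

flags = flag_inconsistent(primary, auxiliary)
print(flags)  # [False, True]
```

In this toy example the models agree on the first input (both predict class 1) and disagree on the second (class 0 versus class 2), so only the second input is flagged.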

