# Towards the Detection of Diffusion Model Deepfakes

## Setup
Create a virtual environment with Python >= 3.8, install a PyTorch version matching your system, and run `pip install -r requirements.txt`.

## Dataset
To recreate the dataset, follow the instructions provided in Appendix A.
You may use `dataset_tool.py` to extract images from `.npz` or `lmdb` files and to apply different preprocessing steps.
When converting from `lmdb`, images are automatically cropped to 256x256 pixels.
For LSUN Cat and LSUN Horse, where the smaller side is not already 256 pixels long, the same resizing as in [guided-diffusion](https://github.com/openai/guided-diffusion) is applied, as these datasets are only evaluated against images generated by ADM.
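
As an illustration of the resizing mentioned above, the following sketch computes the geometry of a resize followed by a center crop, modeled on `center_crop_arr` from guided-diffusion. The helper name `plan_center_crop` is hypothetical, and whether `dataset_tool.py` follows exactly these steps is an assumption.

```python
def plan_center_crop(width, height, image_size=256):
    """Return (resized_size, crop_box) for a resize-then-center-crop.

    Hypothetical helper: mirrors the geometry of guided-diffusion's
    center_crop_arr, which is assumed (not verified) to match dataset_tool.py.
    """
    # Step 1: halve the image while its shorter side is at least twice the
    # target (guided-diffusion uses a box filter for these coarse steps).
    while min(width, height) >= 2 * image_size:
        width, height = width // 2, height // 2
    # Step 2: scale so that the shorter side equals image_size
    # (guided-diffusion uses bicubic interpolation here).
    scale = image_size / min(width, height)
    width, height = round(width * scale), round(height * scale)
    # Step 3: center crop box as (left, top, right, bottom).
    left = (width - image_size) // 2
    top = (height - image_size) // 2
    return (width, height), (left, top, left + image_size, top + image_size)
```

With Pillow, the returned values could be passed to `Image.resize` and `Image.crop`, respectively.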

Your data directory should have the following structure, with the following number of images:
- each folder in `LSUN_Bedroom/test`: 10k images
- each folder `LSUN_Bedroom/train/*/0_real` and `LSUN_Bedroom/train/*/1_fake`: 39k images
- each folder `LSUN_Bedroom/val/*/0_real` and `LSUN_Bedroom/val/*/1_fake`: 1k images
- each folder in `LSUN_Cat, LSUN_Church, LSUN_Horse, FFHQ, ImageNet`: 10k images

```
data
├── FFHQ
│     ├── LDM
│     └── Real
├── ImageNet
│     ├── ADM
│     ├── ADM-G-U
│     └── Real
├── LSUN_Bedroom
│     ├── test
│     │     ├── ADM
│     │     ├── DDPM
│     │     ├── Diff-ProjectedGAN
│     │     ├── Diff-StyleGAN2
│     │     ├── IDDPM
│     │     ├── LDM
│     │     ├── P2
│     │     ├── PNDM
│     │     ├── ProGAN
│     │     ├── ProjectedGAN
│     │     ├── Real
│     │     └── StyleGAN
│     ├── train
│     │     ├── ADM
│     │     │     ├── 0_real
│     │     │     └── 1_fake
│     │     ├── DDPM
│     │     │     ├── 0_real
│     │     │     └── 1_fake
│     │     ├── Diff-ProjectedGAN
│     │     │     ├── 0_real
│     │     │     └── 1_fake
│     │     ├── Diff-StyleGAN2
│     │     │     ├── 0_real
│     │     │     └── 1_fake
│     │     ├── IDDPM
│     │     │     ├── 0_real
│     │     │     └── 1_fake
│     │     ├── LDM
│     │     │     ├── 0_real
│     │     │     └── 1_fake
│     │     ├── P2
│     │     │     ├── 0_real
│     │     │     └── 1_fake
│     │     ├── PNDM
│     │     │     ├── 0_real
│     │     │     └── 1_fake
│     │     ├── ProGAN
│     │     │     ├── 0_real
│     │     │     └── 1_fake
│     │     ├── ProjectedGAN
│     │     │     ├── 0_real
│     │     │     └── 1_fake
│     │     └── StyleGAN
│     │           ├── 0_real
│     │           └── 1_fake
│     └── val
│         ├── ADM
│         │     ├── 0_real
│         │     └── 1_fake
│         ├── DDPM
│         │     ├── 0_real
│         │     └── 1_fake
│         ├── Diff-ProjectedGAN
│         │     ├── 0_real
│         │     └── 1_fake
│         ├── Diff-StyleGAN2
│         │     ├── 0_real
│         │     └── 1_fake
│         ├── IDDPM
│         │     ├── 0_real
│         │     └── 1_fake
│         ├── LDM
│         │     ├── 0_real
│         │     └── 1_fake
│         ├── P2
│         │     ├── 0_real
│         │     └── 1_fake
│         ├── PNDM
│         │     ├── 0_real
│         │     └── 1_fake
│         ├── ProGAN
│         │     ├── 0_real
│         │     └── 1_fake
│         ├── ProjectedGAN
│         │     ├── 0_real
│         │     └── 1_fake
│         └── StyleGAN
│               ├── 0_real
│               └── 1_fake
├── LSUN_Cat
│     ├── ADM
│     └── Real
├── LSUN_Church
│     ├── LDM
│     ├── PNDM
│     └── Real
└── LSUN_Horse
      ├── ADM
      └── Real
```
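
To check a recreated dataset against the counts listed above, a small script along the following lines may help. This is a minimal sketch: the function names and the set of accepted file extensions are assumptions, not part of the repository.

```python
from pathlib import Path

# Assumption: images are stored as PNG or JPEG files.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def count_images(data_root):
    """Map each image-containing folder (relative to data_root) to its image count."""
    root = Path(data_root)
    counts = {}
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            key = path.parent.relative_to(root).as_posix()
            counts[key] = counts.get(key, 0) + 1
    return counts


def expected_count(rel_dir):
    """Expected number of images for a folder, following the list above."""
    parts = rel_dir.split("/")
    if parts[0] == "LSUN_Bedroom" and len(parts) > 1:
        if parts[1] == "train":
            return 39_000
        if parts[1] == "val":
            return 1_000
    return 10_000


def find_mismatches(data_root):
    """Return {folder: (found, expected)} for folders with unexpected counts."""
    return {
        folder: (found, expected_count(folder))
        for folder, found in count_images(data_root).items()
        if found != expected_count(folder)
    }
```

Calling `find_mismatches("/path/to/root/data")` returns an empty dictionary if every populated folder holds the expected number of images. Empty or missing folders are not reported by this sketch.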

## Detection
The detection results can be reproduced using `evaluate_detectors.py`.
Before running, the following preparations are required:

### Cloning the Detector Repositories
In the root directory, create a folder `detectors` and clone the three repositories of [Wang2020](https://github.com/peterwang512/CNNDetection), [Gragnaniello2021](https://github.com/grip-unina/GANimageDetection), and [Mandelli2022](https://github.com/polimi-ispl/gan-image-detection) into it, naming the clones as shown below.
The folder should have the following structure:
```
detectors
├── gragnaniello2021
├── mandelli2022
└── wang2020
```

### Downloading and Training the Models
Download the pre-trained models for each detector following the instructions from the respective repository.
For training and fine-tuning Wang2020, follow the instructions from the official repository and train using the configuration _Blur+JPEG (0.5)_.
The following commands cover the different training setups.

#### Fine-Tuning on a Single Dataset
```
for DSET in {ADM,DDPM,Diff-ProjectedGAN,Diff-StyleGAN2,IDDPM,LDM,PNDM,ProGAN,ProjectedGAN,StyleGAN}; do
    cd /path/to/root/models/wang2020
    mkdir -p finetuning/$DSET
    cp blur_jpg_prob0.5.pth finetuning/$DSET/model_epoch_start.pth
    python /path/to/root/detectors/wang2020/train.py --name $DSET --blur_prob 0.5 --blur_sig 0.0,3.0 --jpg_prob 0.5 --jpg_method cv2,pil --jpg_qual 30,100 --dataroot /path/to/root/data/LSUN_Bedroom --classes $DSET --checkpoints_dir /path/to/root/models/wang2020/finetuning --continue_train --epoch start
done
```

#### Training from Scratch on a Single Dataset
```
for DSET in {ADM,DDPM,Diff-ProjectedGAN,Diff-StyleGAN2,IDDPM,LDM,PNDM,ProGAN,ProjectedGAN,StyleGAN}; do python /path/to/root/detectors/wang2020/train.py --name $DSET --blur_prob 0.5 --blur_sig 0.0,3.0 --jpg_prob 0.5 --jpg_method cv2,pil --jpg_qual 30,100 --dataroot /path/to/root/data/LSUN_Bedroom --classes $DSET --checkpoints_dir /path/to/root/models/wang2020/scratch; done
```

#### Fine-Tuning on an Aggregated Dataset
```
cd /path/to/root/models/wang2020; mkdir -p finetuning/GAN; cp blur_jpg_prob0.5.pth finetuning/GAN/model_epoch_start.pth
python /path/to/root/detectors/wang2020/train.py --name GAN --blur_prob 0.5 --blur_sig 0.0,3.0 --jpg_prob 0.5 --jpg_method cv2,pil --jpg_qual 30,100 --dataroot /path/to/root/data/LSUN_Bedroom --classes Diff-ProjectedGAN,Diff-StyleGAN2,ProGAN,ProjectedGAN,StyleGAN --checkpoints_dir /path/to/root/models/wang2020/finetuning --continue_train --epoch start
cd /path/to/root/models/wang2020; mkdir -p finetuning/DM; cp blur_jpg_prob0.5.pth finetuning/DM/model_epoch_start.pth
python /path/to/root/detectors/wang2020/train.py --name DM --blur_prob 0.5 --blur_sig 0.0,3.0 --jpg_prob 0.5 --jpg_method cv2,pil --jpg_qual 30,100 --dataroot /path/to/root/data/LSUN_Bedroom --classes ADM,DDPM,IDDPM,LDM,PNDM --checkpoints_dir /path/to/root/models/wang2020/finetuning --continue_train --epoch start
cd /path/to/root/models/wang2020; mkdir -p finetuning/All; cp blur_jpg_prob0.5.pth finetuning/All/model_epoch_start.pth
python /path/to/root/detectors/wang2020/train.py --name All --blur_prob 0.5 --blur_sig 0.0,3.0 --jpg_prob 0.5 --jpg_method cv2,pil --jpg_qual 30,100 --dataroot /path/to/root/data/LSUN_Bedroom --classes Diff-ProjectedGAN,Diff-StyleGAN2,ProGAN,ProjectedGAN,StyleGAN,ADM,DDPM,IDDPM,LDM,PNDM --checkpoints_dir /path/to/root/models/wang2020/finetuning --continue_train --epoch start
```

#### Training from Scratch on an Aggregated Dataset
```
python /path/to/root/detectors/wang2020/train.py --name GAN --blur_prob 0.5 --blur_sig 0.0,3.0 --jpg_prob 0.5 --jpg_method cv2,pil --jpg_qual 30,100 --dataroot /path/to/root/data/LSUN_Bedroom --classes Diff-ProjectedGAN,Diff-StyleGAN2,ProGAN,ProjectedGAN,StyleGAN --checkpoints_dir /path/to/root/models/wang2020/scratch
python /path/to/root/detectors/wang2020/train.py --name DM --blur_prob 0.5 --blur_sig 0.0,3.0 --jpg_prob 0.5 --jpg_method cv2,pil --jpg_qual 30,100 --dataroot /path/to/root/data/LSUN_Bedroom --classes ADM,DDPM,IDDPM,LDM,PNDM --checkpoints_dir /path/to/root/models/wang2020/scratch
python /path/to/root/detectors/wang2020/train.py --name All --blur_prob 0.5 --blur_sig 0.0,3.0 --jpg_prob 0.5 --jpg_method cv2,pil --jpg_qual 30,100 --dataroot /path/to/root/data/LSUN_Bedroom --classes Diff-ProjectedGAN,Diff-StyleGAN2,ProGAN,ProjectedGAN,StyleGAN,ADM,DDPM,IDDPM,LDM,PNDM --checkpoints_dir /path/to/root/models/wang2020/scratch
```
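
Since all of the training commands above share the same augmentation flags and differ only in name, classes, and mode, they can also be generated programmatically. The following is a minimal sketch that uses only the flags shown above; the helper `build_train_cmd` is hypothetical and not part of the repository.

```python
# Augmentation flags shared by every command above (Blur+JPEG (0.5)).
BASE_FLAGS = [
    "--blur_prob", "0.5", "--blur_sig", "0.0,3.0",
    "--jpg_prob", "0.5", "--jpg_method", "cv2,pil", "--jpg_qual", "30,100",
]


def build_train_cmd(root, name, classes, finetune):
    """Build the argument list for one training run (hypothetical helper).

    For fine-tuning, the pre-trained checkpoint must already have been copied
    to <root>/models/wang2020/finetuning/<name>/model_epoch_start.pth.
    """
    mode = "finetuning" if finetune else "scratch"
    cmd = [
        "python", f"{root}/detectors/wang2020/train.py",
        "--name", name,
        *BASE_FLAGS,
        "--dataroot", f"{root}/data/LSUN_Bedroom",
        "--classes", ",".join(classes),
        "--checkpoints_dir", f"{root}/models/wang2020/{mode}",
    ]
    if finetune:
        # Resume from model_epoch_start.pth instead of random initialization.
        cmd += ["--continue_train", "--epoch", "start"]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, for example with `build_train_cmd("/path/to/root", "DM", ["ADM", "DDPM", "IDDPM", "LDM", "PNDM"], finetune=True)`.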

When completed, your model directory should have the following structure (training artifacts omitted here):
```
models
├── gragnaniello2021
│     ├── gandetection_resnet50nodown_progan.pth
│     └── gandetection_resnet50nodown_stylegan2.pth
├── mandelli2022
│     ├── method_A.pth
│     ├── method_B.pth
│     ├── method_C.pth
│     ├── method_D.pth
│     └── method_E.pth
└── wang2020
    ├── blur_jpg_prob0.1.pth
    ├── blur_jpg_prob0.5.pth
    ├── finetuning
    │     ├── ADM
    │     │     └── model_epoch_best.pth
    │     ├── All
    │     │     └── model_epoch_best.pth
    │     ├── DDPM
    │     │     └── model_epoch_best.pth
    │     ├── Diff-ProjectedGAN
    │     │     └── model_epoch_best.pth
    │     ├── Diff-StyleGAN2
    │     │     └── model_epoch_best.pth
    │     ├── DM
    │     │     └── model_epoch_best.pth
    │     ├── GAN
    │     │     └── model_epoch_best.pth
    │     ├── IDDPM
    │     │     └── model_epoch_best.pth
    │     ├── LDM
    │     │     └── model_epoch_best.pth
    │     ├── PNDM
    │     │     └── model_epoch_best.pth
    │     ├── ProGAN
    │     │     └── model_epoch_best.pth
    │     ├── ProjectedGAN
    │     │     └── model_epoch_best.pth
    │     └── StyleGAN
    │           └── model_epoch_best.pth
    └── scratch
          ├── ADM
          │     └── model_epoch_best.pth
          ├── All
          │     └── model_epoch_best.pth
          ├── DDPM
          │     └── model_epoch_best.pth
          ├── Diff-ProjectedGAN
          │     └── model_epoch_best.pth
          ├── Diff-StyleGAN2
          │     └── model_epoch_best.pth
          ├── DM
          │     └── model_epoch_best.pth
          ├── GAN
          │     └── model_epoch_best.pth
          ├── IDDPM
          │     └── model_epoch_best.pth
          ├── LDM
          │     └── model_epoch_best.pth
          ├── PNDM
          │     └── model_epoch_best.pth
          ├── ProGAN
          │     └── model_epoch_best.pth
          ├── ProjectedGAN
          │     └── model_epoch_best.pth
          └── StyleGAN
                └── model_epoch_best.pth

```

Once everything is set up, you can recreate Table 2, for instance, by running:
```
python /path/to/root/evaluate_detectors.py /path/to/root/data/LSUN_Bedroom/test /path/to/root/models /path/to/root/output --img-dirs Real ProGAN StyleGAN ProjectedGAN Diff-StyleGAN2 Diff-ProjectedGAN DDPM IDDPM ADM PNDM LDM evaluate --predictors wang2020 gragnaniello2021 mandelli2022 --wang2020-model-path wang2020/blur_jpg_prob0.5.pth wang2020/blur_jpg_prob0.1.pth --gragnaniello2021-model-path gragnaniello2021/gandetection_resnet50nodown_progan.pth gragnaniello2021/gandetection_resnet50nodown_stylegan2.pth --metric AUROC PD@5% PD@1%
```
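
For orientation, the metrics named above can be sketched in plain Python. The sketch assumes that `PD@5%` and `PD@1%` denote the probability of detection (fraction of fake images flagged) at a false alarm rate of 5% and 1% on real images, and that higher scores mean "more likely fake". The function names are hypothetical, and `evaluate_detectors.py` may compute the metrics differently.

```python
from bisect import bisect_left, bisect_right


def auroc(real_scores, fake_scores):
    """Probability that a random fake scores higher than a random real image.

    Ties between a fake and a real score count as one half.
    """
    real_sorted = sorted(real_scores)
    wins = 0.0
    for score in fake_scores:
        lower = bisect_left(real_sorted, score)   # reals strictly below
        upper = bisect_right(real_sorted, score)  # reals below or equal
        wins += lower + 0.5 * (upper - lower)
    return wins / (len(real_sorted) * len(fake_scores))


def pd_at_far(real_scores, fake_scores, far):
    """Fraction of fakes detected when at most `far` of the reals are flagged.

    A sample is flagged if its score is strictly above the threshold.
    Assumes 0 <= far < 1.
    """
    real_sorted = sorted(real_scores, reverse=True)
    allowed = int(far * len(real_sorted))  # reals that may exceed the threshold
    threshold = real_sorted[allowed]
    return sum(score > threshold for score in fake_scores) / len(fake_scores)
```

For example, `pd_at_far(real, fake, 0.05)` corresponds to `PD@5%` under the assumptions stated above.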

## Frequency Analysis
DFT, DCT, and reduced spectra can be created using `frequency_analysis.py`.
For example, to recreate Figure 2a, run:
```
python /path/to/root/frequency_analysis.py /path/to/root/data/LSUN_Bedroom/test /path/to/root/output fft_hp --img-dirs Real ProGAN StyleGAN ProjectedGAN Diff-StyleGAN2 Diff-ProjectedGAN --log --vmin 1e-5 --vmax 1e-1
```
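
To illustrate what an averaged DFT spectrum is, the following sketch computes the log-scaled mean magnitude spectrum of a batch of grayscale images with NumPy. It omits the high-pass filtering that the `fft_hp` mode name suggests, and the function name is hypothetical; it is not the implementation used in `frequency_analysis.py`.

```python
import numpy as np


def mean_log_spectrum(images, eps=1e-12):
    """Log10 of the mean DFT magnitude over a batch of grayscale images.

    images: array of shape (N, H, W). The zero frequency is shifted to the
    center of the returned (H, W) array; eps avoids taking log10 of zero.
    """
    images = np.asarray(images, dtype=np.float64)
    # 2D DFT over the last two axes, with orthonormal scaling.
    spectra = np.fft.fft2(images, norm="ortho")
    # Move the zero-frequency component to the center of each spectrum.
    magnitudes = np.abs(np.fft.fftshift(spectra, axes=(-2, -1)))
    return np.log10(magnitudes.mean(axis=0) + eps)
```

The `--log`, `--vmin`, and `--vmax` options in the command above presumably control a comparable log scaling and color range when plotting.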

The logistic regression results can be computed using:
```
python /path/to/root/logistic_regression.py /path/to/root/data/LSUN_Bedroom /path/to/root/output --img-dirs ProGAN StyleGAN ProjectedGAN Diff-StyleGAN2 Diff-ProjectedGAN DDPM IDDPM ADM PNDM LDM
```

## Analysis of ADM
The directory `ADM_scripts` contains modified scripts from [guided-diffusion](https://github.com/openai/guided-diffusion), which can be used to sample images at intermediate diffusion and denoising steps.
The spectrum evolution plots can be reproduced using `adm_analysis.py`.
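
As background for what an "intermediate diffusion step" means, the following sketch draws a noisy image $x_t$ from the standard DDPM forward process $q(x_t \mid x_0) = \mathcal{N}(\sqrt{\bar\alpha_t}\,x_0, (1-\bar\alpha_t) I)$. The linear noise schedule and all function names are assumptions chosen for illustration; the scripts in `ADM_scripts` rely on guided-diffusion and may use a different schedule.

```python
import numpy as np


def linear_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)


def diffuse(x0, t, alpha_bar, rng):
    """Sample x_t given x_0 by mixing the image with Gaussian noise."""
    noise = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * noise


# Usage: noise a dummy 3x256x256 image at an intermediate step.
rng = np.random.default_rng(0)
alpha_bar = linear_alpha_bar()
x0 = np.zeros((3, 256, 256))
x_mid = diffuse(x0, t=500, alpha_bar=alpha_bar, rng=rng)
```

As `t` grows, `alpha_bar[t]` shrinks towards zero, so `x_t` gradually approaches pure Gaussian noise.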
