# Diffusion Model as a P(x)

## Environment

Python 3.9.12

Other dependencies are listed in `requirements.txt`.


## Training - (Generalization gap)

The generalization gap is computed during training, so no separate evaluation run is needed for it.

Example training command for baseline:
```bash
python train.py \
--seed 20 \
--dataset CIFAR10 \
--resnet \
--resnet_channels 256 \
--resnet_z_channels 1 \
--num_chkpts 100 \
--num_epochs 1000 \
--learning_rate 0.0001 \
--run_name <RUN_NAME> \
--run_batch_name <RUN_BATCH_NAME>
```

Example training command using diffusion-generated samples:
```bash
python train.py \
--seed 20 \
--dataset Diffusion-CIFAR10 \
--data_path <PATH_FOR_SAMPLES> \
--resnet \
--resnet_channels 256 \
--resnet_z_channels 1 \
--num_chkpts 100 \
--num_epochs 1000 \
--learning_rate 0.0001 \
--run_name <RUN_NAME> \
--run_batch_name <RUN_BATCH_NAME>
```

## Evaluation - (Amortization gap)

Example evaluation command for amortization gap:
```bash
python eval_gaps_atk.py \
--seed 20 \
--batch_size 500 \
--mc_num 1 \
--eval_chkpt_num 10 \
--run_name <RUN_NAME> \
--run_batch_name <RUN_BATCH_NAME>
```

## Evaluation - (Robustness gap)
This requires the code in `./attack/`, which is taken from Kuzina et al. (2022); see [this repo](https://github.com/AKuzina/defend_vae_mcmc).

Example evaluation command for robustness gap:
```bash
python eval_gaps_atk.py \
--seed 20 \
--batch_size 5000 \
--eval_chkpt_num 100 \
--run_name <RUN_NAME> \
--run_batch_name <RUN_BATCH_NAME> \
--eval_attack \
--attack_N_ref 20 \
--attack_max_iter 100 \
--attack_lrate 1 \
--attack_p inf \
--attack_eps_norm 0.05
```

## Diffusion Models


### Preparing the Pre-trained Diffusion Model

The diffusion models are trained with the codebase of Karras et al. (2022), see: [EDM](https://github.com/nvlabs/edm).

Each diffusion model is trained on 8 NVIDIA A100 40GB GPUs for approximately 2.5 days. 
We sample on a single NVIDIA A100 40GB GPU. 
Sampling 5000 images takes approximately 25 to 30 minutes. 
More details can be found in Section C of the paper.
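
As a rough planning aid, the sampling time for a larger run can be extrapolated from the figures above. This is a minimal sketch that assumes time scales linearly with the number of images, which ignores start-up cost and batching effects; `estimate_minutes` is a hypothetical helper.

```python
# Measured figures quoted above: 5000 images in roughly 25 to 30 minutes
# on a single GPU.
IMAGES_PER_MEASUREMENT = 5000
MINUTES_LOW, MINUTES_HIGH = 25, 30


def estimate_minutes(num_images):
    """Return (low, high) estimated sampling time in minutes.

    Assumption: linear scaling with the number of images.
    """
    scale = num_images / IMAGES_PER_MEASUREMENT
    return MINUTES_LOW * scale, MINUTES_HIGH * scale


low, high = estimate_minutes(50000)  # a run of 50000 images
print(f"{low / 60:.1f} to {high / 60:.1f} hours")
```

Under this assumption, a 50000-image run would take on the order of four to five hours on one GPU.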

*Within the EDM repo*, we do the following:

**BinaryMNIST and FashionMNIST**

The diffusion models for BinaryMNIST and FashionMNIST are trained with the following hyperparameters.
```bash
train.py \
--augment 0.0 \
--duration 200 \
--lr 10e-4 \
--batch 1024 \
--batch-gpu 128 \
--arch ddpmpp
```

We do (deterministic) sampling with the following hyperparameters.
```bash
generate.py \
--batch 2048 \
--steps 18 \
--seeds 0-49999
```
The seeds are unique for every sampling run.

**CIFAR-10**

The diffusion model for CIFAR-10 is trained with the following hyperparameters.
```bash
train.py \
--augment 0.0 \
--duration 200 \
--lr 10e-4 \
--batch 512 \
--batch-gpu 64 \
--arch ddpmpp
```
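
To sanity-check these settings against the 8-GPU setup mentioned above, the sketch below works out the per-GPU batch, the number of gradient-accumulation rounds, and the approximate number of optimizer updates. It assumes the EDM conventions that `--batch` is the total batch size across GPUs, `--batch-gpu` caps the per-GPU batch, and `--duration` is in millions of images; `training_plan` is a hypothetical helper, not part of EDM.

```python
def training_plan(batch, batch_gpu, num_gpus, duration_mimg):
    """Derive per-GPU batch, accumulation rounds, and update count.

    Assumptions: batch is the total batch size, batch_gpu is a per-GPU cap,
    and duration_mimg is the training length in millions of images.
    """
    if batch % num_gpus != 0:
        raise ValueError("batch must be divisible by num_gpus")
    per_gpu_total = batch // num_gpus
    micro_batch = min(batch_gpu, per_gpu_total)
    if per_gpu_total % micro_batch != 0:
        raise ValueError("per-GPU batch must be divisible by the micro-batch")
    accumulation_rounds = per_gpu_total // micro_batch
    total_images = int(duration_mimg * 1_000_000)
    updates = -(-total_images // batch)  # ceiling division
    return {
        "micro_batch": micro_batch,
        "accumulation_rounds": accumulation_rounds,
        "updates": updates,
    }


# CIFAR-10 settings from this README, on 8 GPUs.
print(training_plan(batch=512, batch_gpu=64, num_gpus=8, duration_mimg=200))
# BinaryMNIST / FashionMNIST settings from this README, on 8 GPUs.
print(training_plan(batch=1024, batch_gpu=128, num_gpus=8, duration_mimg=200))
```

With 8 GPUs, both configurations fit the full per-GPU share in one pass, so no gradient accumulation is needed. Note also that `10e-4` parses as `0.001`, not `0.0001`.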

We do (deterministic) sampling with the following hyperparameters.
```bash
generate.py \
--batch 2048 \
--steps 18 \
--seeds 0-49999
```
The seeds are unique for every sampling run.
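
One simple way to keep seeds from overlapping across sampling runs is to give each run its own contiguous range. The sketch below is only an illustration: `seed_range` is a hypothetical helper, and the README does not state how the seed ranges were actually chosen.

```python
def seed_range(run_index, samples_per_run=50000):
    """Return a disjoint 'start-end' seed range string for a given run.

    Hypothetical helper: run 0 gets 0-49999, run 1 gets 50000-99999, etc.
    """
    if run_index < 0 or samples_per_run <= 0:
        raise ValueError("run_index must be >= 0 and samples_per_run > 0")
    start = run_index * samples_per_run
    end = start + samples_per_run - 1
    return f"{start}-{end}"


for run in range(3):
    print(f"run {run}: --seeds {seed_range(run)}")
```

The resulting string can be passed to the `--seeds` option of `generate.py` in place of `0-49999`.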


### (Alternatively) Samples from Diffusion Models Used in the Submission

We will publish the diffusion-generated samples used in our experiments after the review process.