## Data Exfiltration in Diffusion Models: A Backdoor Attack Approach

Abstract: *In light of the emerging vulnerabilities of diffusion models (DMs) to adversarial attacks, this paper delves into the novel realm of data exfiltration through strategically implanted backdoors. Distinct from conventional methods that directly alter data, our study pioneers the use of unique trigger embeddings for each image, enabling covert data retrieval. Furthermore, we extend our examination to Text-To-Image diffusion models, such as Stable Diffusion, introducing the Caption Backdoor Subnet (CBS) to exploit these models for both image and caption extraction. This innovative approach not only uncovers a previously unexplored facet of diffusion model security but also offers insightful contributions to enhancing the resilience of generative models against sophisticated threats.*

## Requirements
Install the Python dependencies with the following command:
```bash
pip install -r requirements.txt
```

## Preparing datasets

**CIFAR-10:** Download the [CIFAR-10 python version](https://www.cs.toronto.edu/~kriz/cifar.html), convert it to a ZIP archive, and compute its FID reference statistics:

```.bash
python dataset_tool.py --source=downloads/cifar10/cifar-10-python.tar.gz \
    --dest=datasets/cifar10-32x32.zip
python fid.py ref --data=datasets/cifar10-32x32.zip --dest=fid-refs/cifar10-32x32.npz
```

**AFHQv2:** Download the updated [Animal Faces-HQ dataset](https://github.com/clovaai/stargan-v2/blob/master/README.md#animal-faces-hq-dataset-afhq) (`afhq-v2-dataset`), convert it to a ZIP archive at 64x64 resolution, and compute its FID reference statistics:

```.bash
python dataset_tool.py --source=downloads/afhqv2 \
    --dest=datasets/afhqv2-64x64.zip --resolution=64x64
python fid.py ref --data=datasets/afhqv2-64x64.zip --dest=fid-refs/afhqv2-64x64.npz
```
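
Some later steps (`lossthreshold_findthreshold.py --target` and `metrics.py --target`) take a directory of images rather than the ZIP archive. The following is a minimal sketch for unpacking an archive into such a directory. It assumes the archive stores ordinary image files, possibly in subfolders, next to non-image metadata; the helper name `extract_images` and the flattened naming scheme are illustrative and not part of this repository.

```python
import zipfile
from pathlib import Path

# Assumption: dataset images are stored with one of these extensions.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def extract_images(zip_path, out_dir):
    """Copy every image in zip_path into out_dir (flat) and return the count."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    count = 0
    with zipfile.ZipFile(zip_path) as archive:
        for name in sorted(archive.namelist()):
            if Path(name).suffix.lower() not in IMAGE_SUFFIXES:
                continue  # skip directory entries and metadata files
            # Flatten "sub/img.png" to "sub_img.png" so names cannot collide.
            flat_name = name.replace("/", "_")
            (out_dir / flat_name).write_bytes(archive.read(name))
            count += 1
    return count
```

For example, `extract_images("datasets/cifar10-32x32.zip", "datasets/cifar10-32x32")` would produce the directory passed as `--target` in the commands below. Whether the scripts need a flat layout is not stated here, so adjust the naming if they expect subfolders.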

## Training models and generating samples

- For the CIFAR-10 results in Table 1 and Table 2, we used the following commands:

    - EDM
        ```.bash
        # training
        torchrun --standalone --nproc_per_node=4 train.py \
            --outdir=./logs \
            --data=datasets/cifar10-32x32.zip

        # generating
        torchrun --standalone --nproc_per_node 4 generate.py \
            --network ./logs/00000-cifar10-32x32-batch512-portion0-aug0.12-fp32/network-snapshot-200000.pkl \
            --outdir ./cifar10_EDM \
            --seeds 1-50000 \
            --batch 256
        ```

    - EDM + Dup
        ```.bash
        # training
        torchrun --standalone --nproc_per_node=4 train.py \
            --outdir=./logs \
            --data=datasets/cifar10-32x32.zip \
            --dup-ratio 0.2 \
            --dup-N 15

        # generating
        torchrun --standalone --nproc_per_node 4 generate.py \
            --network ./logs/00001-cifar10-32x32-batch512-portion0-aug0.12-dup0.2N15-fp32/network-snapshot-200000.pkl \
            --outdir ./cifar10_EDM_Dup \
            --seeds 1-50000 \
            --batch 256
        ```

    - EDM + LTA
        ```.bash
        # Generate 200000 images
        torchrun --standalone --nproc_per_node 4 generate.py \
            --network <path to network-snapshot.pkl of EDM> \
            --outdir tmp-cifar10-LTA \
            --seeds 1-200000 \
            --batch 128 \
            --steps 18

        # Calculate loss for the generated images at each step
        torchrun --standalone --nproc_per_node 4 lossthreshold_loss.py \
            --network <path to network-snapshot.pkl of EDM> \
            --generated tmp-cifar10-LTA \
            --outpath tmp-cifar10-LTA-loss.pt \
            --batch 512

        # Find the best time step and loss threshold under the condition FPR<0.01
        python lossthreshold_findthreshold.py \
            --target ./datasets/cifar10-32x32 \
            --generated tmp-cifar10-LTA \
            --loss tmp-cifar10-LTA-loss.pt

        # Select the best time and threshold according to the output of the previous command
        python lossthreshold_attack.py \
            --generated tmp-cifar10-LTA \
            --loss tmp-cifar10-LTA-loss.pt \
            --threshold <threshold> \
            --time <time> \
            --outdir cifar10-LTA-attack
        ```

    - EDM + TGF
        ```.bash
        # training
        torchrun --standalone --nproc_per_node=4 train.py \
            --outdir=./logs \
            --data=datasets/cifar10-32x32.zip \
            --unique-portion 0.5 \
            --unique-type unique

        # generating random images
        torchrun --standalone --nproc_per_node 4 generate.py \
            --network ./logs/00002-cifar10-32x32-batch512-portion0.5-aug0.12-fp32/network-snapshot-200000.pkl \
            --outdir ./cifar10_EDM_TGF \
            --seeds 1-50000 \
            --batch 256

        # generating triggered images
        torchrun --standalone --nproc_per_node 4 generate.py \
            --network ./logs/00002-cifar10-32x32-batch512-portion0.5-aug0.12-fp32/network-snapshot-200000.pkl \
            --outdir ./cifar10_EDM_TGF \
            --seeds 1-50000 \
            --unique_labels 1-50000 \
            --batch 256
        ```


- For the AFHQv2 results in Table 1 and Table 2, we used the following commands:

    - EDM
        ```.bash
        # training
        torchrun --standalone --nproc_per_node=8 train.py \
            --outdir=./logs \
            --data=datasets/afhqv2-64x64.zip \
            --batch=256 \
            --cres=1,2,2,2 \
            --lr=2e-4 \
            --dropout=0.25 \
            --augment=0.15

        # generating
        torchrun --standalone --nproc_per_node 4 generate.py \
            --network ./logs/00003-afhqv2-64x64-batch256-portion0-aug0.15-fp32/network-snapshot-200000.pkl \
            --outdir ./afhqv2_EDM \
            --seeds 1-50000 \
            --batch 256
        ```

    - EDM + Dup
        ```.bash
        # training
        torchrun --standalone --nproc_per_node=8 train.py \
            --outdir=./logs \
            --data=datasets/afhqv2-64x64.zip \
            --batch=256 \
            --cres=1,2,2,2 \
            --lr=2e-4 \
            --dropout=0.25 \
            --augment=0.15 \
            --dup-ratio 0.2 \
            --dup-N 15

        # generating
        torchrun --standalone --nproc_per_node 4 generate.py \
            --network ./logs/00004-afhqv2-64x64-batch256-portion0-aug0.15-dup0.2N15-fp32/network-snapshot-200000.pkl \
            --outdir ./afhqv2_EDM_Dup \
            --seeds 1-50000 \
            --batch 256
        ```

    - EDM + LTA
        ```.bash
        # Generate 200000 images
        torchrun --standalone --nproc_per_node 4 generate.py \
            --network <path to network-snapshot.pkl of EDM> \
            --outdir tmp-afhqv2-LTA \
            --seeds 1-200000 \
            --batch 128 \
            --steps 40

        # Calculate loss for the generated images at each step
        torchrun --standalone --nproc_per_node 4 lossthreshold_loss.py \
            --network <path to network-snapshot.pkl of EDM> \
            --generated tmp-afhqv2-LTA \
            --outpath tmp-afhqv2-LTA-loss.pt \
            --batch 512 \
            --steps 40

        # Find the best time step and loss threshold under the condition FPR<0.01
        python lossthreshold_findthreshold.py \
            --target ./datasets/afhqv2-64x64 \
            --generated tmp-afhqv2-LTA \
            --loss tmp-afhqv2-LTA-loss.pt

        # Run the attack with the time step and threshold from the previous command (the values we used are shown)
        python lossthreshold_attack.py \
            --generated tmp-afhqv2-LTA \
            --loss tmp-afhqv2-LTA-loss.pt \
            --threshold 219.4067 \
            --time 20 \
            --outdir afhqv2-LTA-attack
        ```

    - EDM + TGF
        ```.bash
        # training
        torchrun --standalone --nproc_per_node=8 train.py \
            --outdir=./logs \
            --data=datasets/afhqv2-64x64.zip \
            --batch=256 \
            --cres=1,2,2,2 \
            --lr=2e-4 \
            --dropout=0.25 \
            --augment=0.15 \
            --unique-portion 0.5 \
            --unique-type unique

        # generating random images
        torchrun --standalone --nproc_per_node 4 generate.py \
            --network ./logs/00012-afhqv2-64x64-batch256-portion0.5-aug0.15-fp32/network-snapshot-200000.pkl \
            --outdir ./afhqv2_EDM_TGF \
            --seeds 1-50000 \
            --batch 256

        # generating triggered images
        torchrun --standalone --nproc_per_node 4 generate.py \
            --network ./logs/00012-afhqv2-64x64-batch256-portion0.5-aug0.15-fp32/network-snapshot-200000.pkl \
            --outdir ./afhqv2_EDM_TGF \
            --seeds 1-15803 \
            --unique_labels 1-15803 \
            --batch 256
        ```
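
The EDM + LTA recipes above pick a time step and a loss threshold under the condition FPR<0.01. As a rough illustration of that kind of selection, and not the actual logic of `lossthreshold_findthreshold.py`, the sketch below sweeps candidate thresholds and keeps the one with the highest true-positive rate whose false-positive rate stays below the limit. It assumes that a lower loss marks a sample as a suspected training copy and that ground-truth labels for the generated samples are already available; the function names and the input layout are hypothetical.

```python
def pick_threshold(losses, is_positive, max_fpr=0.01):
    """Return (threshold, tpr, fpr) with the highest TPR subject to FPR < max_fpr.

    A sample is flagged when its loss is <= threshold (assumed direction).
    Returns None if no threshold satisfies the FPR limit.
    """
    n_pos = sum(1 for p in is_positive if p)
    n_neg = len(is_positive) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative sample")

    pairs = sorted(zip(losses, is_positive))
    best = None
    tp = fp = 0
    i = 0
    while i < len(pairs):
        threshold = pairs[i][0]
        # Consume every sample tied at this loss value before scoring it.
        while i < len(pairs) and pairs[i][0] == threshold:
            if pairs[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr = fp / n_neg
        if fpr >= max_fpr:
            break  # FPR only grows as the threshold increases
        tpr = tp / n_pos
        if best is None or tpr > best[1]:
            best = (threshold, tpr, fpr)
    return best


def pick_time_and_threshold(losses_by_time, is_positive, max_fpr=0.01):
    """Return (time, threshold, tpr) for the time step with the highest TPR."""
    best = None
    for time, losses in losses_by_time.items():
        found = pick_threshold(losses, is_positive, max_fpr)
        if found is not None and (best is None or found[1] > best[2]):
            best = (time, found[0], found[1])
    return best
```

Here `losses_by_time` maps each time step to one loss per generated sample, in the same order as `is_positive`. The selected pair plays the role of the `--time` and `--threshold` arguments passed to `lossthreshold_attack.py`.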

## Evaluate Metrics

Download the SSCD TorchScript model [`sscd_disc_mixup.torchscript.pt`](https://dl.fbaipublicfiles.com/sscd-copy-detection/sscd_disc_mixup.torchscript.pt) from the [official repository](https://github.com/facebookresearch/sscd-copy-detection), and place it in the `sscd` directory.

- Calculate L2, SSIM, LPIPS, SSCD, Precision and Recall
    ```.bash
    python metrics.py \
        --target <path to dataset> \
        --generated <path to generated images>
    ```
    Note that `--target` must point to a directory containing the dataset images; ZIP archives are not supported.

- Calculate FID
    ```.bash
    torchrun --standalone --nproc_per_node 1 fid.py calc \
        --ref ${fidref} \
        --images ${generated_images}
    ```
    `fidref` is the path to the dataset's reference statistics computed in [Preparing datasets](#preparing-datasets), and `generated_images` is the directory of generated images.
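
For orientation, the standard definition of FID compares the mean and covariance of Inception features for the generated images against those of the reference set. The symbols below are notation introduced here, with subscript g for generated and r for reference. That the reference `.npz` stores exactly these two statistics is an assumption about `fid.py`, not something stated in this README.

```math
\mathrm{FID} = \lVert \mu_g - \mu_r \rVert_2^2 + \operatorname{Tr}\!\left( \Sigma_g + \Sigma_r - 2 \left( \Sigma_g \Sigma_r \right)^{1/2} \right)
```

Lower values mean the two feature distributions are closer, so scores are only comparable when they are computed against the same reference statistics.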

## License

All material is licensed under the [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License](http://creativecommons.org/licenses/by-nc-sa/4.0/).
