# Synthetic Dataset Creation Using DDG

This repository is the official implementation of DDG (Diffusion Dataset Generation) for creating synthetic datasets.

---

## Requirements

To set up the environment for running the scripts in this repository, follow these steps:

```bash
cd iclr25_code 
conda env create -f environment.yaml
conda activate layerdiffuse

# You may need to install Diffusers from source to run the scripts
pip install git+https://github.com/huggingface/diffusers
```

---

## Datasets

The following datasets are used in this project:

- **CUB:** [CUB-200-2011](https://www.vision.caltech.edu/datasets/cub_200_2011/)
- **Car:** [StanfordCars](https://github.com/jhpohovey/StanfordCars-Dataset)
- **Aircraft:** [FGVC-Aircraft](https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/)
- **WaterBird:** [group_DRO](https://github.com/kohpangwei/group_DRO)

For convenience, all datasets have been converted to the **ImageFolder** format, which can be directly loaded using `torchvision.datasets.ImageFolder`.

Use the [water_bird_extractor.py](water_bird_extractor.py) script to extract the WaterBird dataset split used in the paper.

Each dataset is organized as follows:

```text
|-- dataset
    |-- train
        |-- class1
            |-- img1
            |-- img2
            |-- ...
        |-- class2
            |-- img1
            |-- img2
            |-- ...
    |-- test
        |-- class1
            |-- img1
            |-- img2
            |-- ...
        |-- class2
            |-- img1
            |-- img2
            |-- ...
```
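
Before training, it can help to confirm that a converted dataset matches this layout. The following is a minimal sketch using only the Python standard library. The function name `check_imagefolder_layout` and the list of image extensions are assumptions made for illustration and are not part of this repository. Class folders are visited in sorted order, which is the order `torchvision.datasets.ImageFolder` uses when assigning label indices.

```python
from pathlib import Path

# Assumed image extensions; adjust to match your data.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".bmp", ".webp"}


def check_imagefolder_layout(root, splits=("train", "test")):
    """Return {split: {class_name: image_count}}; raise ValueError on a bad layout."""
    root = Path(root)
    summary = {}
    for split in splits:
        split_dir = root / split
        if not split_dir.is_dir():
            raise ValueError(f"missing split directory: {split_dir}")
        counts = {}
        class_dirs = sorted(
            (p for p in split_dir.iterdir() if p.is_dir()),
            key=lambda p: p.name,
        )
        for class_dir in class_dirs:
            n_images = sum(
                1
                for f in class_dir.iterdir()
                if f.is_file() and f.suffix.lower() in IMAGE_EXTS
            )
            if n_images == 0:
                raise ValueError(f"no images found in {class_dir}")
            counts[class_dir.name] = n_images
        if not counts:
            raise ValueError(f"no class folders found in {split_dir}")
        summary[split] = counts
    # Every split should expose the same set of class folders.
    reference = set(summary[splits[0]])
    for split in splits:
        if set(summary[split]) != reference:
            raise ValueError(f"class folders in {split} differ from {splits[0]}")
    return summary
```

For example, `check_imagefolder_layout("dataset")` returns the per-class image counts for `train` and `test`, or raises `ValueError` describing the first problem found.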

---

## Preparing CDP & CIP Data

To generate the **CDP** and **CIP** datasets, use the [sam_bird_aircraft_car.sh](sam_bird_aircraft_car.sh) script.
Make sure to update the dataset path in [sam_cdp_cip.py](sam_cdp_cip.py) before running it.

Example:

```bash
# Run once per shard: $i is the shard index (and GPU id), $dataset the dataset name
CUDA_VISIBLE_DEVICES=$i \
python sam_cdp_cip.py --dataset $dataset --nsplits 8 --split $i
```
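
The `$i` variable in the command above suggests a loop over shards, one per GPU. As a rough illustration, the same loop can be written in Python. This is a sketch and not the contents of `sam_bird_aircraft_car.sh`: the helper names `build_jobs` and `launch` are hypothetical, and the code assumes that shard `i` runs on GPU `i`. The dataset name `cub` in the usage comment is borrowed from the `train.py` examples and may differ for this script.

```python
import os
import subprocess
import sys


def build_jobs(dataset, nsplits=8, script="sam_cdp_cip.py"):
    """Return one (command, env) pair per shard, pinning shard i to GPU i."""
    jobs = []
    for i in range(nsplits):
        cmd = [
            sys.executable, script,
            "--dataset", dataset,
            "--nsplits", str(nsplits),
            "--split", str(i),
        ]
        # Copy the current environment and restrict the process to one GPU.
        env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(i))
        jobs.append((cmd, env))
    return jobs


def launch(jobs):
    """Start all shards in parallel and return their exit codes."""
    procs = [subprocess.Popen(cmd, env=env) for cmd, env in jobs]
    return [p.wait() for p in procs]


# Usage (assumes the repository root as working directory and 8 visible GPUs):
# exit_codes = launch(build_jobs("cub"))
```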
---

## Transparency Textual Inversion

You can perform **Transparency Textual Inversion** using the [transparency_textual_inversion.py](transparency_textual_inversion.py) script. An example usage script is provided in [inver_cub_cdp_0.5_500.sh](inver_cub_cdp_0.5_500.sh).

Example:

```bash
accelerate launch --config_file 8gpu.yaml transparency_textual_inversion.py \
    --initializer_token="bird" \
    --placeholder_token="<embed>" \
    --mixed_precision="fp16" \
    --resolution=512 \
    --num_vectors=1 \
    --train_batch_size=4 \
    --max_train_steps=$max_train_steps \
    --train_dir_list="$file_path" \
    --output_dir="$output_dir" \
    --save_steps=100 \
    --learning_rate=1e-4 \
    --lr_warmup_steps=0 \
    --strength $strength
```
---

## DDG (Diffusion Dataset Generation)

Once you have the CDP & CIP data and the required Transparency Textual Inversion outputs, generate datasets with **DDG** using the [img2img.py](img2img.py) script. This project uses the [SG161222/RealVisXL_V4.0](https://huggingface.co/SG161222/RealVisXL_V4.0) diffusion model, which is downloaded automatically.

Example generation script: [gen_cub_cluster1_strength0.5.sh](gen_cub_cluster1_strength0.5.sh).

Example:

```bash
CUDA_VISIBLE_DEVICES=$gpu_id python img2img.py \
    --dataset $dataset \
    --split $gpu_id \
    --nsplits 8 \
    --strength $strength \
    --nclusters $nclusters \
    --num_generated_per_image $num_generated_per_image \
    --max_strength $max_strength
```
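
Each worker receives `--split` and `--nsplits`, so every shard index from `0` to `nsplits - 1` has to run for the whole dataset to be processed. How `img2img.py` divides the images among workers is not described here. The sketch below shows one common scheme (strided slicing over a sorted file list) purely as an illustration. The function `shard` is hypothetical and is not taken from the repository.

```python
def shard(items, split, nsplits):
    """Return the items assigned to worker `split` out of `nsplits` workers.

    Assumption: strided assignment over a sorted list. The repository's
    scripts may partition the data differently.
    """
    if not 0 <= split < nsplits:
        raise ValueError(f"split must be in [0, {nsplits}), got {split}")
    return sorted(items)[split::nsplits]


files = [f"img{i:02d}.jpg" for i in range(10)]
parts = [shard(files, s, 4) for s in range(4)]

# Every file is assigned to exactly one worker.
assert sorted(f for part in parts for f in part) == files
```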
---

## Training & Testing

For training, the following scripts are provided:

- [train.py](train.py)
- [train_hub_vit.py](train_hub_vit.py)
- [train_hub_waterbird.py](train_hub_waterbird.py)

Make sure to update the dataset paths in these scripts before training.


### Example usage for `train.py`:

1. **Using default settings**:
    ```bash
    python train.py --mode train --dataset cub --model resnet18
    ```

2. **Using the `ResNet50` model on GPU 0 with `randaugment`**:
    ```bash
    python train.py --mode train --dataset cub --model resnet50 --gpu 0 --randaug
    ```

3. **Using the `ViT` model with learning rate 0.001 and the `diffusemix` training mode**:
    ```bash
    python train.py --mode train --dataset car --model vit --lr 0.001 --train_mode diffusemix
    ```

4. **Resuming training from a checkpoint with a fixed random seed**:
    ```bash
    python train.py --mode train --dataset aircraft --model densenet121 --resume --seed 42
    ```

5. **Using `CutMix` and `MixUp` data augmentation with `hook_hidden` mixup type**:
    ```bash
    python train.py --mode train --dataset cub --model resnet18 --use_cutmix --use_mixup --mixup_type hook_hidden
    ```

6. **Custom augmentation and synthetic-image settings (`--prob_aug`, `--prob_syn`, `--num_syn`, `--strength`)**:
    ```bash
    python train.py --mode train --dataset car --model resnet50 --prob_aug 0.6 --prob_syn 0.3 --num_syn 5 --strength 0.5
    ```

7. **Evaluation mode (test only)**:
    ```bash
    python train.py --mode test --dataset aircraft --model densenet121 --gpu 0
    ```


The training and testing procedures follow those of [GuidedMixUp](https://github.com/3neutronstar/GuidedMixup) and [Diff-Mix](https://github.com/Zhicaiwww/Diff-Mix).

---

## Acknowledgements

This project builds upon [diffusers](https://github.com/huggingface/diffusers) and [LayerDiffuse](https://github.com/lllyasviel/LayerDiffuse?tab=readme-ov-file). The training and testing scripts were modified from [GuidedMixUp](https://github.com/3neutronstar/GuidedMixup) and [Diff-Mix](https://github.com/Zhicaiwww/Diff-Mix).
Special thanks to the contributors.