## Fair Diffusion Sampling without Demographics

### Overview
- **Unconditional diffusion debiasing**: Fair sampling for an unconditional CelebA-HQ diffusion model (sampled with DDIM), using centroid prototypes computed in the full UNet feature space.
- **SD2.1 debiasing**: Fair sampling for Stable Diffusion 2.1.
- **SD3 debiasing**: Fair sampling for Stable Diffusion 3.
- **Backbone bias analysis**: Investigation of bias behavior across image encoders (OpenCLIP, DINOv2/3, ViT, ResNet, CLIP) on CelebA attributes via unsupervised clustering.


### Installation

- Python 3.9+
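- Install the PyTorch build that matches your CUDA version (the command below installs the default build; check the PyTorch install selector if you need a specific CUDA variant):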
```bash
pip install --upgrade pip
pip install torch torchvision torchaudio
```
- Core libraries:
```bash
pip install diffusers accelerate transformers safetensors
pip install open-clip-torch
pip install numpy scikit-learn matplotlib tqdm pillow umap-learn
```
- Hugging Face authentication (needed to download the SD3 weights):
```bash
pip install huggingface_hub
huggingface-cli login
```
- Optional (GPU UMAP acceleration): RAPIDS cuML UMAP if available on your system.
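
Before running any of the scripts, it can help to confirm that the packages above are importable. The sketch below uses only the standard library. The helper name `missing_modules` is hypothetical and not part of this repository. Note that several import names differ from their pip names (`open_clip` for `open-clip-torch`, `sklearn` for `scikit-learn`, `PIL` for `pillow`, `umap` for `umap-learn`).

```python
import importlib.util

# Import names for the pip packages listed above (several differ from the pip name).
REQUIRED_MODULES = [
    "torch", "torchvision", "torchaudio",
    "diffusers", "accelerate", "transformers", "safetensors",
    "open_clip", "numpy", "sklearn", "matplotlib", "tqdm", "PIL", "umap",
    "huggingface_hub",
]

def missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_modules()
    print("All set." if not missing else f"Missing: {', '.join(missing)}")
```

An empty result only means the modules can be located; it does not verify versions or GPU support.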

### Data and model access
- Unconditional, SD2.1, and SD3 sampling do not require a dataset; images are generated and saved to the directory given by `--output-dir`.
- `backbone.py` requires CelebA `metadata.csv` and image folder (`img_align_celeba/`). Update the paths at the bottom of the script or call the function programmatically.
- SD3 weights may require accepting the license on the Hugging Face Hub.
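
Since `backbone.py` depends on local CelebA files, a quick pre-flight check can catch a wrong path before any model is loaded. This is a minimal sketch, not part of the repository: the helper name `check_celeba_inputs` is hypothetical, and it assumes `metadata.csv` has a header row containing the attribute column names (for example `Male` or `Young`).

```python
import csv
from pathlib import Path

def check_celeba_inputs(csv_file, root_dir, attribute_cols):
    """Return a list of problems found; an empty list means the inputs look usable."""
    problems = []
    csv_path, img_dir = Path(csv_file), Path(root_dir)
    if not img_dir.is_dir():
        problems.append(f"image folder not found: {img_dir}")
    if not csv_path.is_file():
        problems.append(f"metadata file not found: {csv_path}")
        return problems  # cannot inspect the header without the file
    # Assumption: the first row of the CSV is a header with column names.
    with csv_path.open(newline="") as f:
        header = next(csv.reader(f), [])
    for col in attribute_cols:
        if col not in header:
            problems.append(f"column missing from metadata: {col}")
    return problems

if __name__ == "__main__":
    for problem in check_celeba_inputs(
        "/path/to/metadata.csv", "/path/to/img_align_celeba/", ("Male", "Young")
    ):
        print(problem)
```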

### Quickstart
#### Unconditional diffusion (DDIM CelebA-HQ)
```bash
python unconditional.py \
  --model-id google/ddpm-celebahq-256 \
  --num-init-samples 600 --n-clusters 12 --target-samples 1000 \
  --num-inference-steps 50 --alpha 0.07 --image-size 256 \
  --output-dir outputs/uncond_celebahq
```
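
How `unconditional.py` distributes `--target-samples` across `--n-clusters` is not documented in this section. As an illustration only, one natural way to make the final sample set uniform over clusters is to give each cluster a near-equal quota. The function `balanced_quotas` below is a hypothetical helper, not the script's actual logic.

```python
def balanced_quotas(target_samples, n_clusters):
    """Split target_samples into n_clusters near-equal integer quotas."""
    if n_clusters <= 0:
        raise ValueError("n_clusters must be positive")
    base, extra = divmod(target_samples, n_clusters)
    # The first `extra` clusters receive one additional sample so the total is exact.
    return [base + 1 if i < extra else base for i in range(n_clusters)]

# With the values from the command above: 1000 samples over 12 clusters.
quotas = balanced_quotas(1000, 12)
print(quotas)       # four clusters get 84 samples, eight get 83
print(sum(quotas))  # 1000
```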

#### Stable Diffusion 2.1
```bash
python sd2_fair.py \
  --prompt "a headshot of a firefighter" \
  --num-init-samples 200 --n-clusters 10 --target-samples 200 \
  --output-dir outputs/sd21_firefighter --init-from-prototype
```

#### Stable Diffusion 3
```bash
python sd3_fair.py \
  --prompt "A photo of a single firefighter." \
  --num-init-samples 200 --n-clusters 10 --target-samples 200 \
  --num-inference-steps 28 --alpha 0.3 --init-from-prototype \
  --output-dir outputs/sd3_firefighter
```

#### Backbone bias analysis (CelebA)
- Edit the paths at the bottom of `backbone.py` and run:
```bash
python backbone.py
```
- Alternatively, call `run_experiments` programmatically:
```python
from backbone import run_experiments  # assumes the function lives in backbone.py
out = run_experiments(
    csv_file="/path/to/metadata.csv",
    root_dir="/path/to/img_align_celeba/",
    batch_size=8,
    splits_to_use=(2,),
    attribute_cols=("Male", "Young", "Eyeglasses", "Blond_Hair"),
    model_names=("OpenCLIP", "DINOV3", "DINOV2", "ViT", "ResNet", "CLIP"),
    cluster_methods=("K-means", "GMM", "pca", "umap", "hierarchical"),
    cluster_counts=(2, 4, 6, 8, 10),
    reduction_dim=32,
    output_json="results.json",
)
print(out)
```
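
The metrics that `run_experiments` writes to `results.json` are not described in this section. To illustrate how unsupervised cluster assignments can be compared against a CelebA attribute such as `Male`, the sketch below computes cluster purity in plain Python. The function `cluster_purity` is a hypothetical helper, and purity is only one possible measure; it is not claimed to be the one the script uses.

```python
from collections import Counter, defaultdict

def cluster_purity(cluster_ids, attribute_values):
    """Fraction of samples whose attribute matches the majority value of their cluster."""
    if len(cluster_ids) != len(attribute_values):
        raise ValueError("inputs must have the same length")
    if not cluster_ids:
        raise ValueError("inputs must be non-empty")
    # Count attribute values separately inside each cluster.
    by_cluster = defaultdict(Counter)
    for cid, value in zip(cluster_ids, attribute_values):
        by_cluster[cid][value] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(cluster_ids)

# Toy example: six images, two clusters, one binary attribute.
clusters = [0, 0, 0, 1, 1, 1]
male = [1, 1, 0, 0, 0, 0]
print(cluster_purity(clusters, male))  # 5 of 6 samples agree with their cluster majority
```

A purity near 1.0 means the clusters separate the attribute almost perfectly, which suggests the encoder's embedding space strongly encodes that attribute. Purity never falls below the frequency of the most common attribute value, so compare it against that baseline.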
