# Real-world Debiasing benchmark (RDBench) + Debias in Destruction (DiD)

This is the official PyTorch implementation of "**Towards Real-world Debiasing: Rethinking Evaluation, Challenge, and Solution**".

![bias_capture](assets/bias_capture.png)

> **Abstract**: Spurious correlations in training data significantly hinder the generalization capability of machine learning models when faced with distribution shifts, leading to the proposition of numerous debiasing methods.
However, it remains to be asked: *Do existing benchmarks for debiasing really represent biases in the real world?* Recent works attempt to address such concerns by sampling from real-world data (instead of synthesizing) according to some predefined biased distributions to ensure the realism of individual samples. 
However, the realism of the biased distribution is more critical yet challenging and underexplored due to the complexity of real-world bias distributions.
To tackle the problem, we propose a fine-grained framework for analyzing biased distributions, 
based on which we empirically and theoretically identify key characteristics of biased distributions in the real world that are poorly represented by existing benchmarks. 
Towards applicable debiasing in the real world, we further introduce two novel real-world-inspired biases to bridge this gap and build a systematic evaluation framework for real-world debiasing, RDBench.
Furthermore, focusing on the practical setting of debiasing without bias labels, we find that real-world biases pose a novel *Sparse bias capturing* challenge to the existing paradigm.
We propose a simple yet effective approach, named Debias in Destruction (DiD), to address the challenge, whose effectiveness is validated with extensive experiments on 8 datasets of various biased distributions.
> 

## Setup

- Clone this repo and install dependencies.

```bash
git clone https://github.com/<>.git
cd <>
pip install -r requirements.txt
```

## Evaluation on various biased distributions in the real world

Rather than providing only a fixed set of datasets, we provide code for sampling, curating, or synthesizing new datasets with various distributions in `create_bias_datasets_template.py`. For a new dataset, the following parameters should be specified:
```python
# Spurious correlation between biased features and target features (classes).
# Use None if the feature is considered unbiased; this yields datasets with various bias prevalences.
bias = [0, 3, None, None, 8]
# Bias magnitude of the biased features.
corr = [0.98, 0.2, None, None, 0.9]
# Marginal distribution of the features relative to the target features.
density = [1] * 5
```
The above settings fully define the distribution of a dataset. To create a new dataset with this distribution, you only need to implement the sampling function `bias_sample_synthesis(images, bias_feat)` to suit your specific needs.
```python
def bias_sample_synthesis(images, bias_feat):
    """Sample or synthesize images according to the given condition.

    Args:
        images: Original images
        bias_feat: Expected biased feature of the images

    Returns:
        Synthesized image
    """
    # Template stub: replace with dataset-specific sampling or synthesis logic.
    raise NotImplementedError
```
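As a rough illustration of what such a function could look like, the sketch below tints a grayscale image with a color chosen by the bias feature, in the spirit of Colored MNIST. It is not the repository's implementation: the `PALETTE` colors, the treatment of `images` as a single H x W nested list of intensities in [0, 1], and the nested-list output format are all assumptions made to keep the example dependency-free. A real implementation would more likely operate on arrays or tensors.

```python
# Hypothetical palette: one RGB color per bias-feature index (not from the repository).
PALETTE = [
    (1.0, 0.0, 0.0),  # red
    (0.0, 1.0, 0.0),  # green
    (0.0, 0.0, 1.0),  # blue
    (1.0, 1.0, 0.0),  # yellow
    (1.0, 0.0, 1.0),  # magenta
]


def bias_sample_synthesis(images, bias_feat):
    """Tint one grayscale image with the color assigned to bias_feat.

    Assumes `images` is a single H x W image given as nested lists of
    intensities in [0, 1]; returns H rows of W (r, g, b) tuples.
    """
    r, g, b = PALETTE[bias_feat]
    return [[(v * r, v * g, v * b) for v in row] for row in images]


# Usage: color a tiny 2 x 2 "image" with bias feature 2 (blue in this palette).
gray = [[0.0, 0.5], [1.0, 0.25]]
colored = bias_sample_synthesis(gray, bias_feat=2)
print(colored[1][0])  # (0.0, 0.0, 1.0)
```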
For more information and examples, please refer to our implementation on Colored MNIST and Corrupted CIFAR10 with various bias distributions:
```bash
python create_Cifar10C_dataset.py 
python create_cmnist_dataset.py 
```
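To make the `bias`, `corr`, and `density` settings more concrete, here is a minimal sketch of one possible reading of them. It assumes that `bias[i]` is the class that feature `i` is spuriously correlated with, that `corr[i]` is the probability of that class given feature `i`, and that `density` holds unnormalized feature weights. These semantics, the helper name `sample_feature_class_pairs`, and the `num_classes` argument are assumptions for illustration only; the template script is the authoritative definition.

```python
import random
from collections import Counter


def sample_feature_class_pairs(bias, corr, density, num_classes, n, seed=0):
    """Draw n (feature, class) pairs under the assumed reading of the settings."""
    rng = random.Random(seed)
    features = list(range(len(bias)))
    pairs = []
    for _ in range(n):
        # Pick a bias feature according to its (unnormalized) marginal weight.
        feat = rng.choices(features, weights=density, k=1)[0]
        target = bias[feat]
        if target is None:
            # Unbiased feature: the class is drawn uniformly.
            cls = rng.randrange(num_classes)
        elif rng.random() < corr[feat]:
            # Bias-aligned sample: the class matches the correlated target.
            cls = target
        else:
            # Bias-conflicting sample: any class except the correlated target.
            cls = rng.choice([c for c in range(num_classes) if c != target])
        pairs.append((feat, cls))
    return pairs


bias = [0, 3, None, None, 8]
corr = [0.98, 0.2, None, None, 0.9]
density = [1] * 5
pairs = sample_feature_class_pairs(bias, corr, density, num_classes=10, n=20000)
counts = Counter(pairs)
feat0_total = sum(v for (f, _), v in counts.items() if f == 0)
# Empirical P(class=0 | feature=0); should land near corr[0] = 0.98.
print(counts[(0, 0)] / feat0_total)
```

Under this reading, three of the five features are biased, and the entries of `corr` control how strongly each one predicts its class.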


## Evaluation on multi-bias scenarios in the real world

As above, `create_multi-bias_dataset_template.py` handles the multi-bias scenario, which combines multiple biases. Setting the bias prevalence and magnitude of each bias defines the joint distribution of the biases. The sampling function `bias_sample_synthesis` must again be implemented to suit your needs.

Please refer to `create_Cifar10C-MB_dataset.py` as an example.

The example datasets mentioned above will be released on [Google Drive](https://drive.google.com).

## Evaluation on other existing benchmarks

- Download the datasets from this [link](https://drive.google.com/drive/folders/1q_8zIqJHVSxjU2p5zaN1l2Zf-uSmS6Fx?usp=sharing) and place them under `./dataset`.
- Unzip each dataset with the following scripts.

```bash
# BFFHQ
bash ./scripts/unzip_codes/unzip_bffhq.sh
# Dogs & Cats
bash ./scripts/unzip_codes/unzip_dnc.sh
# BAR
bash ./scripts/unzip_codes/unzip_bar.sh
```

- BFFHQ is the dataset used in “Learning Debiased Representation via Disentangled Feature Augmentation” (Lee et al., NeurIPS 2021). For Dogs & Cats and BAR, we provide versions with different levels of bias severity, built by manipulating the Dogs & Cats dataset from “Learning Not to Learn: Training Deep Neural Networks with Biased Data” (Kim et al., CVPR 2019) and [BAR](https://github.com/alinlab/BAR) from “Learning from Failure: Training Debiased Classifier from Biased Classifier” (Nam et al., NeurIPS 2020), respectively.

<br>

## Training with DiD

You can train LfF (“Learning from Failure: Training Debiased Classifier from Biased Classifier”, Nam et al., NeurIPS 2020) and DisEnt (“Learning Debiased Representation via Disentangled Feature Augmentation”, Lee et al., NeurIPS 2021) with DiD using the following commands.

### LfF + DiD

#### cmnist

```bash
python train.py --alg=shape --dataset=cmnist-10 --corr=0.5pct --lr=0.0001 --batch_size=256 --biloss="x'" --ps=7 --wandb
```

#### cifar10c

```bash
python train.py --alg=shape --dataset=cifar10c-10 --corr=0.5pct --lr=0.0001 --batch_size=256 --biloss="x'" --ps=8 --wandb
```

### DisEnt + DiD

#### cmnist

```bash
python train.py --alg=disentShape --dataset=cmnist-10 --corr=0.5pct --lr=0.0001 --batch_size=256 --biloss="x'" --ps=7 --wandb
```

#### cifar10c

```bash
python train.py --alg=disentShape --dataset=cifar10c-10 --corr=0.5pct --lr=0.0001 --batch_size=256 --biloss="x'" --ps=8 --wandb
```


## Contact




## Acknowledgments

The training part of this repository is based on [BiasEnsemble](https://github.com/kakaoenterprise/BiasEnsemble).
