This directory is for simulating the proposed pre-processing approach on the 
synthetic dataset. The program needs PyTorch, CVXPY, Jupyter Notebook, and CUDA.

The directory contains a total of 5 files and 1 child directory: 
1 README, 3 python files, 1 jupyter notebook, 
and the child directory containing 6 numpy files for synthetic data.
The synthetic data contains training set and test set.
----------------------------------------------------------------------
To simulate the algorithm, please use the jupyter notebook in the directory.
----------------------------------------------------------------------
The jupyter notebook will load the data and train the models.
We consider two scenarios: supporting (1) a single metric (DP) and (2) multiple metrics (DP & EO).

Each training shows either in-processing-only baseline or our framework 
(i.e., pre- + in-processing). Note that we use FairBatch [Roh et al., ICLR 2021] 
as the in-processing baseline that adaptively adjusts batch ratios for fairness.
When using our pre-processing, we utilize a SDP solver to find the new data ratio. 
The solver is defined in our program.
Experiments are repeated 5 times each.
After the training, the test accuracy and fairness will be shown.

The two python files are models.py, utils.py, and FairBatchSampler_Multiple.py.
The models.py contains a logistic regression architecture.
The utils.py contains a total of three functions for 
finding example weight, sampling data, and testing the model performances.
The FairBatchSampler_Multiple.py contains two classes: CustomDataset and FairBatch. 
CustomDataset class defines the dataset, and FairBatch class implements the state-of-the-art 
in-processing technique [Roh et al., ICLR 2021] that adjusts batch ratios for fairness.

The detailed explanations about each component have been written 
in the codes as comments.
Thanks!
