# A benchmark for fairness-constrained machine learning

[![License](https://img.shields.io/badge/License-Apache_2.0-blue.svg)](https://opensource.org/licenses/Apache-2.0) [![Setup](https://github.com/humancompatible/train/actions/workflows/setup.yml/badge.svg)](https://github.com/humancompatible/train/actions/workflows/setup.yml)


## Reproducing the Benchmark

### Basic installation instructions

The code requires Python `3.11` or later.

1. Create a virtual environment

**bash** (Linux)

```
python -m venv fairbenchenv
source fairbenchenv/bin/activate
```

**cmd** (Windows)

```
python -m venv fairbenchenv
fairbenchenv\Scripts\activate.bat
```

2. Install the requirements and the package from the `humancompatible-train` folder:

```
cd humancompatible-train
pip install -r requirements_benchmark.txt
pip install .
```

If you wish to edit the code, install as an editable package:

```
pip install -e .
```

**Warning**: Stochastic Ghost can use the MKL-accelerated version of the scipy package; without it, the algorithm may run slower in some cases. To install it, run the following after installing `requirements_benchmark.txt`:

```
pip install --force-reinstall -r requirements_benchmark_mkl.txt
```

This option is not supported on macOS and may fail on some Windows devices.
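The installation requirements above can be checked before creating the environment. The following is a minimal sketch, not part of the repository: the function names `python_ok` and `mkl_status` are hypothetical, and the status strings only restate the platform notes in this README.

```python
import platform
import sys

MIN_PYTHON = (3, 11)  # minimum version stated in this README


def python_ok(version=None):
    """Return True if the interpreter meets the minimum Python version."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


def mkl_status(system=None):
    """Restate the README's platform notes for requirements_benchmark_mkl.txt."""
    system = platform.system() if system is None else system
    if system == "Darwin":  # platform.system() reports macOS as "Darwin"
        return "unsupported"
    if system == "Windows":
        return "may fail"
    return "no issue noted"


if __name__ == "__main__":
    print(f"Python >= 3.11: {python_ok()}")
    print(f"MKL requirements on {platform.system()}: {mkl_status()}")
```

A result of `no issue noted` only means this README lists no known problem for that platform; it does not guarantee that the MKL installation will succeed.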

<!-- Install via pip -->
<!-- ``` -->
<!-- pip install folktables -->
<!-- ``` -->

### Running the algorithms

The benchmark comprises the following algorithms:

- Stochastic Ghost [[2]](#2),
- SSL-ALM [[3]](#3),
- Stochastic Switching Subgradient [[4]](#4).

To reproduce experiment 1 of the paper, run the following:

```
cd experiments
python run_folktables.py data=folktables_RAC1P alg=sslalm constraint=diff_loss
python run_folktables.py data=folktables_RAC1P alg=alm constraint=diff_loss
python run_folktables.py data=folktables_RAC1P alg=ghost constraint=diff_loss
python run_folktables.py data=folktables_RAC1P alg=ssg constraint=diff_loss
python run_folktables.py data=folktables_RAC1P alg=sgd constraint=diff_loss # baseline, no fairness
python run_folktables.py data=folktables_RAC1P alg=sgd-pen constraint=diff_loss # penalized
```

Each command starts 10 runs of the chosen `alg`, each lasting 30 seconds.
The results are saved to `experiments/utils/saved_models` and `experiments/utils/exp_results`.
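To launch all six commands in sequence instead of typing them one by one, a small driver script can help. The sketch below is an illustration rather than a script shipped with the repository: the helper names `build_commands` and `run_all` are hypothetical, and it assumes it is started from the repository root so that `experiments` is a valid working directory.

```python
import subprocess
import sys

# Algorithm names, dataset, and constraint are taken from the commands above.
ALGORITHMS = ["sslalm", "alm", "ghost", "ssg", "sgd", "sgd-pen"]


def build_commands(data="folktables_RAC1P", constraint="diff_loss"):
    """Return one argument list per algorithm, mirroring the commands above."""
    return [
        [
            sys.executable,
            "run_folktables.py",
            f"data={data}",
            f"alg={alg}",
            f"constraint={constraint}",
        ]
        for alg in ALGORITHMS
    ]


def run_all(dry_run=True, cwd="experiments"):
    """Print each command; when dry_run is False, also execute it in `cwd`."""
    for cmd in build_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the sequence at the first failing run.
            subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    run_all(dry_run=True)  # switch to False to actually launch the experiments
```

By default the script only prints the commands, which makes it easy to verify them before committing to the full run.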

Similarly, it is possible to reproduce experiment 2 (Appendix).
<!-- In the repository, we include the configuration needed to reproduce the experiments in the paper. To do so, go to `experiments` and run `python run_folktables.py data=folktables alg=sslalm`. -->
<!-- Repeat for the other algorithms by changing the `alg` parameter. -->

This repository uses [Hydra](https://hydra.cc/) to manage parameters; see `experiments/conf` for configuration files.

- To change experiment parameters, such as the number of runs per algorithm, the run time, or the dataset (*note: only Folktables [[1]](#1) is supported for now*), edit `experiment.yaml`.
- To change dataset settings, such as the file location, or to make dataset-specific adjustments, such as the configuration of the protected attributes, edit `data/{dataset_name}.yaml`.
- To change algorithm hyperparameters, edit `alg/{algorithm_name}.yaml`.
- To change constraint hyperparameters, edit `constraint/{constraint_name}.yaml`.
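To see which files a given command line draws on, the layout above can be written as a small lookup. This is a minimal sketch under two assumptions: that `experiment.yaml` sits directly in `experiments/conf`, and that each command-line choice (for example `alg=sslalm`) names a YAML file in the matching group folder, which is the usual Hydra convention. The function name `config_files` is hypothetical.

```python
from pathlib import PurePosixPath

CONF_DIR = PurePosixPath("experiments/conf")  # configuration root named above


def config_files(data, alg, constraint):
    """Map the command-line choices to the YAML files described above."""
    return {
        "experiment": CONF_DIR / "experiment.yaml",
        "data": CONF_DIR / "data" / f"{data}.yaml",
        "alg": CONF_DIR / "alg" / f"{alg}.yaml",
        "constraint": CONF_DIR / "constraint" / f"{constraint}.yaml",
    }


if __name__ == "__main__":
    files = config_files("folktables_RAC1P", "sslalm", "diff_loss")
    for group, path in files.items():
        print(f"{group:<10} -> {path}")
```

Hydra applications also accept the `--cfg job` flag, which prints the composed configuration without running the job. Assuming `run_folktables.py` is a standard Hydra entry point, appending `--cfg job` to any of the commands above should show the values that would be used.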

<!-- ; it is installed as one of the dependencies. -->
<!-- To learn more about using Hydra, please check out the [official tutorial](https://hydra.cc/docs/tutorials/basic/your_first_app). -->

### Producing plots

Plots and tables like those in the paper can be produced with two notebooks: `experiments/algo_plots.ipynb` contains the convergence plots, and `experiments/model_plots.ipynb` contains all the others.

<a id="1">[1]</a>
Ding, Hardt & Miller et al. (2021) Retiring Adult: New Datasets for Fair Machine Learning, Curran Associates, Inc.

<a id="2">[2]</a>
Facchinei & Kungurtsev (2023) Stochastic Approximation for Expectation Objective and Expectation Inequality-Constrained Nonconvex Optimization, arXiv.

<a id="3">[3]</a>
Huang, Zhang & Alacaoglu (2025) Stochastic Smoothed Primal-Dual Algorithms for Nonconvex Optimization with Linear Inequality Constraints, arXiv.

<a id="4">[4]</a>
Huang & Lin (2023) Oracle Complexity of Single-Loop Switching Subgradient Methods for Non-Smooth Weakly Convex Functional Constrained Optimization, Curran Associates, Inc.
