A Near-Optimal Algorithm for Debiasing Trained Machine Learning Models

Ibrahim Alabdulmohsin; Mario Lucic

Published: 09 Nov 2021, Last Modified: 26 May 2025, NeurIPS 2021 Poster, Readers: Everyone

Keywords: Fairness, Classification, Statistical Parity, Deep Learning

TL;DR: We propose a near-optimal post-processing algorithm for debiasing models and demonstrate that it outperforms previous methods.

Abstract: We present a scalable post-processing algorithm for debiasing trained models, including deep neural networks (DNNs), which we prove to be near-optimal by bounding its excess Bayes risk. We empirically validate its advantages on standard benchmark datasets across both classical algorithms and modern DNN architectures, and demonstrate that it outperforms previous post-processing methods while performing on par with in-processing methods. In addition, we show that the proposed algorithm is particularly effective for models trained at scale, where post-processing is a natural and practical choice.

Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.

Supplementary Material: pdf

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/a-near-optimal-algorithm-for-debiasing/code)
