A Near-Optimal Recipe for Debiasing Trained Machine Learning Models

28 Sept 2020 (modified: 05 May 2023)
ICLR 2021 Conference Blind Submission
Readers: Everyone
Abstract: We present an efficient and scalable algorithm for debiasing trained models, including deep neural networks (DNNs), which we prove to be near-optimal by bounding its excess Bayes risk. Unlike previous black-box reduction methods to cost-sensitive classification rules, the proposed algorithm operates on models that have been trained without having to retrain the model. Furthermore, as the algorithm is based on projected stochastic gradient descent (SGD), it is particularly attractive for deep learning applications. We empirically validate the proposed algorithm on standard benchmark datasets across both classical algorithms and modern DNN architectures and demonstrate that it outperforms previous post-processing approaches for unbiased classification.
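To make the abstract's recipe concrete, the sketch below illustrates the general idea of post-processing a frozen model's scores with projected SGD. It is an illustrative assumption, not the paper's algorithm: the per-group logit offsets, the squared parity-gap surrogate, and all names (debias_scores, lam, bound) are hypothetical choices made only for this example.

# Minimal sketch (assumed, not the paper's exact method): fit one additive
# logit offset per protected group with projected SGD, trading fidelity to
# the frozen model's scores against a smooth statistical-parity surrogate.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def debias_scores(scores, groups, lam=5.0, lr=0.5, steps=2000,
                  batch=256, bound=2.0, seed=0):
    """scores: trained-model probabilities in (0, 1); groups: int array of 0/1.
    Assumes both groups appear in each mini-batch (illustration only)."""
    rng = np.random.default_rng(seed)
    logits = np.log(scores) - np.log1p(-scores)   # invert the sigmoid
    theta = np.zeros(2)                           # one offset per group
    for _ in range(steps):
        idx = rng.choice(len(scores), size=batch)
        z, s, a = logits[idx], scores[idx], groups[idx]
        p = sigmoid(z + theta[a])                 # adjusted scores
        dp = p * (1.0 - p)                        # d p / d offset
        grad = np.zeros(2)
        rates = np.zeros(2)                       # per-group positive rates
        for g in (0, 1):
            m = a == g
            rates[g] = p[m].mean()
            # gradient of the fidelity term, mean (p - s)^2
            grad[g] = 2.0 * ((p[m] - s[m]) * dp[m]).sum() / batch
        # gradient of lam * (r0 - r1)^2, a smooth parity-gap surrogate
        diff = rates[0] - rates[1]
        for g, sign in ((0, 1.0), (1, -1.0)):
            grad[g] += lam * 2.0 * diff * sign * dp[a == g].mean()
        theta -= lr * grad
        theta = np.clip(theta, -bound, bound)     # projection onto the box
    return sigmoid(logits + theta[groups]), theta

Under these assumptions, adjusted, theta = debias_scores(model_probs, group_labels) returns scores whose group-wise positive rates are pulled together; increasing lam tightens the statistical-parity surrogate at the cost of fidelity to the original predictions, while the clipping step plays the role of the projection in projected SGD.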
One-sentence Summary: The paper introduces a new near-optimal algorithm for debiasing trained models.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=FxSxYuDzyi