Keywords: Fair Representation Learning, Statistical Guarantees, Controllable Guarantees
TL;DR: We introduce a representation learning framework that provides high-confidence fairness guarantees with controllable error thresholds and confidence levels via adversarial inference.
Abstract: Representation learning is increasingly applied to generate representations that generalize well across multiple downstream tasks.
Ensuring fairness in representation learning is crucial to prevent unfair treatment of specific demographic groups in downstream tasks.
In this work, we formally introduce the task of learning representations that achieve high-confidence fairness.
We aim to guarantee that demographic disparity in every downstream prediction remains bounded by a *user-defined* error threshold $\epsilon$, with *controllable* high probability.
To this end, we propose the ***F**air **R**epresentation learning with high-confidence **G**uarantees (FRG)* framework, which provides these high-confidence fairness guarantees by leveraging an optimized adversarial model.
We empirically evaluate FRG on three real-world datasets, comparing its performance to six state-of-the-art fair representation learning methods.
Our results demonstrate that FRG consistently bounds unfairness across a range of downstream models and tasks.
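The demographic disparity that the abstract proposes to bound by $\epsilon$ is, in its standard form, the absolute gap in positive-prediction rates between two demographic groups. The following minimal sketch (not the paper's implementation; the function name and toy data are illustrative) shows how this quantity is typically measured for a downstream classifier:

```python
import numpy as np

def demographic_disparity(y_pred, group):
    """Absolute gap in positive-prediction rates between groups 0 and 1
    (the standard demographic-parity difference)."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rate0 = y_pred[group == 0].mean()  # positive rate for group 0
    rate1 = y_pred[group == 1].mean()  # positive rate for group 1
    return abs(rate0 - rate1)

# Toy example: group 0 receives positives at rate 0.75, group 1 at 0.25.
y_pred = np.array([1, 0, 1, 1, 0, 0, 1, 0])
group  = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(demographic_disparity(y_pred, group))  # 0.5
```

A fairness guarantee of the kind described would require this gap to stay below a user-chosen $\epsilon$, with high probability, for every downstream predictor trained on the learned representations.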
Supplementary Material: zip
Primary Area: Social and economic aspects of machine learning (e.g., fairness, interpretability, human-AI interaction, privacy, safety, strategic behavior)
Submission Number: 13122