Fairness guarantee in analysis of incomplete data

Yiliang Zhang; Qi Long

28 Sept 2020 (modified: 05 May 2023), ICLR 2021 Conference Withdrawn Submission, Readers: Everyone

Keywords: Algorithmic fairness, missing data analysis, domain adaptation

Abstract: Missing data are prevalent and present daunting challenges in real data analysis. While there is a growing body of literature on fairness in the analysis of fully observed data, there has been little work investigating fairness in the analysis of incomplete data when the goal is to develop a fair algorithm in the complete data domain, where there are no missing values. In practice, a popular analytical approach for dealing with missing data is to use only the set of complete cases, i.e., observations with all features fully observed, as a representation of complete data in learning. However, depending on the missing data mechanism, the complete case domain and the complete data domain may have different data distributions, and an algorithm that is fair in the complete case domain may show disproportionate bias towards some marginalized groups in the complete data domain. To fill this significant gap, we study the problem of estimating fairness in the complete data domain for a model trained using observed data and evaluated in the complete case domain. We provide upper and lower bounds on the fairness estimation error and conduct numerical experiments to assess our theoretical results. Our work provides the first known results on fairness guarantees in the analysis of incomplete data.
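
The gap between the two domains described in the abstract can be illustrated with a small toy construction. This is only a sketch under assumptions that are not taken from the paper: fairness is measured by a demographic parity gap, the model is a fixed threshold on a single feature `x`, and the missingness pattern is hand-built so that low-`x` records in group 1 are dropped more often. All names (`predict`, `parity_gap`, `observed`) are hypothetical.

```python
def predict(x):
    # Hypothetical fixed classifier: positive prediction when x >= 0.5.
    return int(x >= 0.5)

def positive_rate(records, group):
    # Share of positive predictions within one group.
    preds = [r["pred"] for r in records if r["group"] == group]
    return sum(preds) / len(preds)

def parity_gap(records):
    # Demographic parity gap: absolute difference in positive rates.
    return abs(positive_rate(records, 0) - positive_rate(records, 1))

complete_data = []

# Group 0: x spread evenly over [0, 1), every record fully observed.
for i in range(100):
    x = i / 100
    complete_data.append({"group": 0, "x": x, "pred": predict(x), "observed": True})

# Group 1, low-x records: 70 records below 0.5, of which 40 have a missing
# feature (assumed missingness pattern that depends on group and on x).
for i in range(70):
    x = i / 140
    complete_data.append(
        {"group": 1, "x": x, "pred": predict(x), "observed": i % 7 >= 4}
    )

# Group 1, high-x records: 30 records at or above 0.5, all fully observed.
for i in range(30):
    x = 0.5 + i / 60
    complete_data.append({"group": 1, "x": x, "pred": predict(x), "observed": True})

# Complete case domain: keep only records with no missing feature.
complete_cases = [r for r in complete_data if r["observed"]]

gap_complete_cases = parity_gap(complete_cases)
gap_complete_data = parity_gap(complete_data)

print("parity gap, complete cases:", gap_complete_cases)
print("parity gap, complete data: ", gap_complete_data)
```

In this construction the classifier looks perfectly fair when evaluated on complete cases, yet its positive rates differ by 0.2 across groups in the complete data. The difference between the two gaps is the kind of fairness estimation error the abstract proposes to bound.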

One-sentence Summary: Our work provides the first known results on fairness guarantee when using incomplete data as representations in learning.

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics

Reviewed Version (pdf): https://openreview.net/references/pdf?id=ioR_ALPMg0

