Auditor Fairness Evaluation via Learning Latent Assessment Models from Elicited Human Feedback

TMLR Paper79 Authors

06 May 2022 (modified: 28 Feb 2023) · Rejected by TMLR
Abstract: The algorithmic fairness literature presents numerous mathematical notions and metrics, and also points to tradeoffs that arise when one attempts to satisfy several of them simultaneously. Furthermore, the contextual nature of fairness notions makes it difficult to automate bias evaluation across diverse algorithmic systems. Therefore, in this paper, we propose a novel model, called the latent assessment model (LAM), to characterize binary feedback provided by human auditors, by assuming that the auditor compares the classifier's output to his/her own intrinsic judgment for each input. We prove that individual and/or group fairness notions are guaranteed as long as the auditor's intrinsic judgments inherently satisfy the fairness notion at hand and are relatively similar to the classifier's evaluations. We also demonstrate this relationship between LAM and traditional fairness notions on three well-known datasets, namely the COMPAS, German Credit and Adult Census Income datasets. Furthermore, we derive the minimum number of feedback samples needed to obtain probably approximately correct (PAC) learning guarantees when estimating LAM for black-box classifiers. Moreover, we propose a novel multi-attribute reputation measure to evaluate an auditor's preference toward various fairness notions as well as sensitive groups. These guarantees are also validated using standard machine learning algorithms trained on real binary feedback elicited from 400 human auditors regarding COMPAS.
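The feedback mechanism behind LAM admits a compact illustration. Below is a minimal Python sketch, under the assumption that the auditor's binary feedback is simply agreement between his/her latent judgment and the classifier's output on each input; the names `classifier`, `intrinsic_judgment`, and `lam_feedback`, and the linear threshold rules inside them, are hypothetical stand-ins and not the paper's actual construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def classifier(X):
    # Stand-in black-box classifier: a fixed linear threshold rule.
    return (X @ np.array([1.0, -0.5]) > 0).astype(int)

def intrinsic_judgment(X):
    # Stand-in for the auditor's latent assessment of each input,
    # deliberately similar to (but not identical with) the classifier.
    return (X @ np.array([0.9, -0.6]) + 0.1 > 0).astype(int)

def lam_feedback(X):
    # Binary feedback: 1 if the auditor agrees with the classifier's
    # output on that input, 0 otherwise.
    return (classifier(X) == intrinsic_judgment(X)).astype(int)

X = rng.normal(size=(1000, 2))
agreement = lam_feedback(X).mean()
print(f"Fraction of inputs where the auditor agrees with the classifier: {agreement:.2f}")
```

Under this reading, learning LAM from feedback amounts to recovering the auditor's latent judgment from the observed agreement labels, which is what makes a PAC-style sample-complexity analysis natural.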
Submission Length: Long submission (more than 12 pages of main content)
Changes Since Last Submission: Our response to Reviewer MQ16's comments and the corresponding changes in the original paper will be highlighted in RED. Our response to Reviewer Syg5's comments and the corresponding changes will be highlighted in BLUE. Our response to Reviewer p9qv's comments and the corresponding changes will be highlighted in CYAN.
-- Improved our introduction and motivation sections.
-- Discussed crowd-worker reliability in Related Work.
-- Included Definition 2 on direct and indirect fairness measures.
-- Included Definitions 3 and 4 on PAC learnability of the auditor and the system, respectively.
-- Added Theorem 2, demonstrating the tolerable gap between the direct and indirect fairness measures.
-- Added Figures 5, 6, 7 and 8, showing the trends of accuracy and absolute error for smaller training-set sizes.
-- Added Figure 12, which presents the reputation matrices of the simulated auditor.
Assigned Action Editor: ~Alexandra_Chouldechova1
Submission Number: 79