[Re] On the Reproducibility of “FairCal: Fairness Calibration for Face Verification”

Marga Don; Satchit Chatterji; Milena Kapralova; Ryan Amaudruz

Published: 02 Aug 2023, Last Modified: 02 Aug 2023, Venue: MLRC 2022, Readers: Everyone

Keywords: ReScience C, ReScience X, Machine Learning, Python, Reproducibility, Fairness, Face Verification, Calibration, Clustering, Bias Reduction

TL;DR: This paper aims to reproduce the study FairCal: Fairness Calibration for Face Verification by Salvador et al.

Abstract:

Scope of Reproducibility: This paper aims to reproduce the study "FairCal: Fairness Calibration for Face Verification" by Salvador et al., focusing on verifying three main claims: FairCal (introduced by the authors) achieves state-of-the-art (i) global accuracy, (ii) fairness-calibrated probabilities and (iii) equality in false positive rates across sensitive attributes (i.e. predictive equality). The sensitive attribute taken into account is ethnicity.

Methodology: Salvador et al. provide partial code via a GitHub repository. Additional code to generate image embeddings from three pretrained neural network models was based on existing repositories. All code was refactored to fit our needs, keeping extendability and readability in mind. Two datasets were used, namely Balanced Faces in the Wild (BFW) and Racial Faces in the Wild (RFW). Additional experiments using Gaussian mixture models instead of K-means clustering for FairCal validate the use of unsupervised clustering methods. The code was run on an AMD Ryzen 7 2700X CPU and an NVIDIA GeForce GTX1080Ti GPU, with a total runtime of around 3 hours for all experiments.

Results: In most cases, we were able to reproduce results from the original paper to within 1 standard deviation, and we observed similar trends. However, due to missing information about image pre-processing, we were unable to reproduce the results exactly.

What was easy: The original paper is clear and understandable. Furthermore, the authors provided a mostly working version of the code. Though the datasets are not freely available to the public, their authors supplied them to us swiftly after we contacted them.

What was difficult: While most of the code worked with slight changes, it assumed that files containing image embeddings were available for both datasets, which the authors neither provided nor gave details about. We therefore pre-processed the images and generated embeddings independently of the authors, which makes it more difficult to judge the overall reproducibility of their method. Additionally, we encountered difficulties while improving the efficiency and extendability of the code.

Communication with original authors: We emailed the first author of the paper twice. In reply to the first email, sent at the beginning of our undertaking, they were enthusiastic about our attempt and clarified a few initial doubts about their implementation, the embeddings, and missing files. As of the writing of this paper, they have not responded to the second email.
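Claim (iii) in the abstract, predictive equality, concerns how similar the false positive rate (FPR) is across groups defined by the sensitive attribute. The following is a minimal sketch of that idea, not code from the original paper or from this reproduction. The function names, the toy scores, the group labels "A" and "B", the threshold of 0.5, and the max-minus-min gap used as a summary are all illustrative assumptions.

```python
from collections import defaultdict


def false_positive_rates(scores, labels, groups, threshold):
    """Return the FPR per group.

    A false positive is an impostor pair (label 0) whose score is
    at or above the threshold, i.e. wrongly accepted as a match.
    """
    negatives = defaultdict(int)
    false_positives = defaultdict(int)
    for score, label, group in zip(scores, labels, groups):
        if label == 0:
            negatives[group] += 1
            if score >= threshold:
                false_positives[group] += 1
    return {g: false_positives[g] / negatives[g] for g in negatives}


def fpr_gap(fprs):
    """Illustrative summary: largest minus smallest per-group FPR."""
    return max(fprs.values()) - min(fprs.values())


# Toy data (hypothetical): similarity scores for image pairs,
# 1 = genuine pair, 0 = impostor pair, with a group per pair.
scores = [0.9, 0.2, 0.6, 0.1, 0.8, 0.7, 0.3, 0.55]
labels = [1, 0, 0, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

fprs = false_positive_rates(scores, labels, groups, threshold=0.5)
gap = fpr_gap(fprs)
print(fprs, gap)
```

With a single global threshold, group "B" in this toy data has twice the FPR of group "A". A gap of zero would correspond to perfect predictive equality at that threshold.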

Paper Url: https://openreview.net/pdf?id=nRj0NcmSuxb

Paper Review Url: https://openreview.net/forum?id=nRj0NcmSuxb

Paper Venue: ICLR 2022

Confirmation: The report pdf is generated from the provided camera ready Google Colab script; The report metadata is verified from the camera ready Google Colab script; The report contains correct author information; The report contains link to code and SWH metadata; The report follows the ReScience latex style guides as in the Reproducibility Report Template (https://paperswithcode.com/rc2022/registration); The report contains the Reproducibility Summary in the first page; The latex .zip file is verified from the camera ready Google Colab script

Latex: zip

Journal: ReScience Volume 9 Issue 2 Article 15

Doi: https://www.doi.org/10.5281/zenodo.8173686

Code: https://archive.softwareheritage.org/swh:1:dir:875537f11cad3f77fcd8fc7b313d27605118a634
