Keywords: fairness, bias, face verification, faircal, oracle, predictive equality, fairly calibrated
Abstract: Reproducibility Summary
===
Scope of Reproducibility
---
This reproducibility report verifies the claim by Salvador et al. in “FairCal: Fairness Calibration for Face Verification” [1] that the FairCal and Oracle methods are fair with respect to sensitive attributes and achieve state-of-the-art accuracy in face verification compared to FSN and FTC. We aim to reproduce the relative values in Tables 2, 3 and 4 of the original paper for these methods. We also provide and empirically support an intuitive explanation of why FairCal outperforms Oracle.
Methodology
---
The authors provided partial code for creating the results: the code to create and preprocess the embeddings was missing, but the code to run the experiments on these embeddings was provided. We nevertheless re-implemented the code from scratch, keeping the data structure identical. The hardware used consisted of personal laptops without a GPU and a desktop with an MSI GeForce GTX 1060 3GB GPU.
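As a concrete illustration of the missing embedding step we re-implemented, the minimal sketch below generates FaceNet embeddings for a list of face images. It assumes the facenet-pytorch package; the image path and output file name are hypothetical placeholders, and this is a sketch of the kind of pipeline we wrote, not the original authors' code.

```python
# Hedged sketch of the embedding pre-processing step (assumes facenet-pytorch).
import pickle
import torch
from PIL import Image
from facenet_pytorch import MTCNN, InceptionResnetV1

mtcnn = MTCNN(image_size=160)                            # face detection + alignment + crop
model = InceptionResnetV1(pretrained='vggface2').eval()  # pretrained FaceNet embedder

embeddings = {}
for path in ["data/example_identity/0001.jpg"]:          # placeholder image list
    face = mtcnn(Image.open(path).convert("RGB"))        # aligned face tensor, or None if no face found
    if face is None:
        continue                                         # skip images without a detected face
    with torch.no_grad():
        embeddings[path] = model(face.unsqueeze(0)).squeeze(0).numpy()  # 512-dim embedding

with open("embeddings_facenet_vggface2.pkl", "wb") as f:
    pickle.dump(embeddings, f)                           # dict keyed by image path, as in our pipeline
```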
Results
---
Compared to the results reported in the original paper, the reproduced results vary across embedding models and evaluation metrics: some combinations match the original results closely, while others deviate significantly. Despite this, the claims of the original paper are confirmed: no loss of accuracy, fairly calibrated subgroups, and predictive equality.
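Predictive equality here means that false positive rates (FPRs) are approximately equal across sensitive subgroups. The sketch below shows one way such subgroup FPRs can be computed from verification scores; the scores, group labels and threshold are illustrative, not values from the paper.

```python
import numpy as np

def false_positive_rates(scores, labels, groups, threshold):
    """FPR per sensitive subgroup: fraction of imposter (non-matching) pairs
    whose similarity score exceeds the decision threshold."""
    fprs = {}
    for g in np.unique(groups):
        mask = (groups == g) & (labels == 0)   # imposter pairs belonging to subgroup g
        fprs[g] = float(np.mean(scores[mask] >= threshold))
    return fprs

# Toy example: predictive equality holds when subgroup FPRs are (nearly) equal.
scores = np.array([0.9, 0.2, 0.8, 0.1, 0.7, 0.3])
labels = np.array([1, 0, 1, 0, 0, 0])          # 1 = genuine pair, 0 = imposter pair
groups = np.array(["A", "A", "B", "B", "A", "B"])
print(false_positive_rates(scores, labels, groups, threshold=0.5))
```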
What was easy
---
Some parts of the reproduction went smoothly, such as obtaining the data and models and the quick execution of the experiments. Furthermore, the paper was clear about the evaluation metrics. Finally, the code for the figures worked out of the box.
What was difficult
---
The exact steps of the original implementation were unclear to us because the provided code had few comments and its structure was not immediately obvious. Additionally, obtaining and correctly running the ArcFace model from its ONNX file proved difficult, because we had never worked with ONNX before and initially downloaded a broken copy of the model.
Communication with original authors
---
We had indirect contact with the first author, who provided an example of the required metadata structure and clarified that all unmentioned hyperparameters were kept at their default values.
[1]: T. Salvador, S. Cairns, V. Voleti, N. Marshall, and A. M. Oberman. “FairCal: Fairness Calibration for Face Verification.” In: International Conference on Learning Representations. 2022.
Paper Url: https://arxiv.org/abs/2106.03761v4
Paper Review Url: https://openreview.net/forum?id=nRj0NcmSuxb
Paper Venue: ICLR 2022
Confirmation: The report PDF is generated from the provided camera-ready Google Colab script; the report metadata is verified from the camera-ready Google Colab script; the report contains correct author information; the report contains a link to the code and SWH metadata; the report follows the ReScience LaTeX style guide as in the Reproducibility Report Template (https://paperswithcode.com/rc2022/registration); the report contains the Reproducibility Summary on the first page; the LaTeX .zip file is verified from the camera-ready Google Colab script.
TL;DR: We confirm that FairCal provides SOTA fairness guarantees
Latex: zip
Journal: ReScience Volume 9 Issue 2 Article 28
Doi: https://www.doi.org/10.5281/zenodo.8173719
Code: https://archive.softwareheritage.org/swh:1:dir:95f2895ab0761e8a2341dc83e26cdcbbc5a0ecde