Replication Study of "Fairness and Bias in Online Selection"

Published: 11 Apr 2022, Last Modified: 05 May 2023, RC2021, Readers: Everyone
TL;DR: Replication study of the paper Fairness and Bias in Online Selection (2021) by Correa et al.
Abstract:
Scope of Reproducibility: In this paper, we reproduce the results obtained in the paper 'Fairness and Bias in Online Selection'. The goal of the reproduction study is to validate the four main claims made in the original paper: (1) for the multi-color secretary problem, an optimal online algorithm is fair, (2) for the multi-color secretary problem, an optimal offline algorithm is unfair, (3) for the multi-color prophet problem, an optimal online algorithm is fair, and (4) for the multi-color prophet problem, an optimal online algorithm is less efficient relative to the offline algorithm. To test whether the results of the secretary algorithm generalize to other data sets, the proposed algorithms and baselines are applied to the UFRGS Entrance Exam and GPA data set.
Methodology: The original paper includes a link to a repository containing C++ files for the algorithms that were implemented. For our experiments, we reimplemented the code in Python. Our goal was to reproduce the code efficiently without altering the core logic. Using the Python code, all experiments in the paper were replicated, along with some additional experiments to verify the claims made in the original paper.
Results: The reproduced results support all claims made in the original paper. However, in the case of the unfair secretary algorithm (SA), some irregular results arise in the experiments due to randomness. This irregularity is also present in the original code.
What was easy: The concepts behind the algorithms were straightforward. The existing code base provided a solid reference point: compiling and running the provided code allowed us to verify the results of the original paper.
What was difficult: Implementing the prophet algorithm was more complex than implementing the secretary algorithm. In addition, C++ is a compiled language, and its code typically runs much faster than equivalent Python code, which had to be taken into account when reproducing the algorithms. A direct transliteration of the C++ code might be feasible on a powerful machine, but with our available resources it would have taken over 96 hours to run. To address this, some of the data structures were converted to NumPy arrays to reduce computation time.
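To illustrate the kind of speed-up that moving from Python loops to NumPy arrays can give, here is a minimal sketch. It is not the authors' code and not the multi-color algorithm from the original paper: it assumes the classical single-color secretary rule (observe the first n/e candidates, then pick the first one that beats them). The function names `secretary_loop` and `secretary_vectorized` are hypothetical.

```python
import math
import numpy as np


def secretary_loop(values):
    """Classical 1/e rule on one sequence, written as a plain Python loop."""
    n = len(values)
    cutoff = int(n / math.e)  # length of the observation phase
    best_seen = max(values[:cutoff], default=float("-inf"))
    for i in range(cutoff, n):
        if values[i] > best_seen:  # first candidate beating the sample
            return i
    return -1  # no candidate selected


def secretary_vectorized(values):
    """Same rule applied to every row of a (trials, n) array at once (n >= 1)."""
    values = np.asarray(values, dtype=float)
    trials, n = values.shape
    cutoff = int(n / math.e)
    if cutoff > 0:
        best_seen = values[:, :cutoff].max(axis=1, keepdims=True)
    else:
        best_seen = np.full((trials, 1), -np.inf)
    beats = values[:, cutoff:] > best_seen  # boolean mask per trial
    first = beats.argmax(axis=1) + cutoff   # index of first True in each row
    return np.where(beats.any(axis=1), first, -1)


# Illustrative run on synthetic data (assumed uniform values, fixed seed).
rng = np.random.default_rng(0)
sample = rng.random((1000, 50))
picks = secretary_vectorized(sample)
```

The vectorized version replaces one Python-level loop per trial with a handful of array operations over all trials, which is the general pattern behind the conversion described above. The actual data structures converted in the reproduction may differ.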
Paper Url:
Paper Venue: ICML 2021
Supplementary Material: zip