Reimplementing the Adversarially Reweighted Learning model by Lahoti et al. (2020) to improve fairness without demographics
Keywords: Fairness, ARL, Demographics
Abstract: Scope of Reproducibility
The data used in Machine Learning systems often does not contain protected group membership because of privacy rules and regulations. This makes it difficult to improve fairness for disadvantaged subgroups. As a solution, Lahoti et al. propose Adversarially Reweighted Learning (ARL) \citep{lahoti2020fairness}. They claim that ARL significantly improves fairness for computationally identifiable subgroups.
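For orientation, the reweighting idea behind ARL can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the adversary outputs one non-negative score per training example and that the example weight is 1 + n * score / sum(scores), which is our reading of the formulation in Lahoti et al. The function names `arl_example_weights` and `reweighted_loss` are hypothetical, and averaging the weighted losses is an assumption made for this sketch.

```python
def arl_example_weights(adversary_scores):
    """Turn non-negative adversary scores into per-example weights.

    Assumed form: weight_i = 1 + n * score_i / sum(scores), so every
    example keeps a weight of at least 1 and the weights sum to 2n.
    """
    n = len(adversary_scores)
    total = sum(adversary_scores)
    if total == 0:
        # Degenerate case (assumption): fall back to uniform weights.
        return [1.0] * n
    return [1.0 + n * s / total for s in adversary_scores]


def reweighted_loss(losses, weights):
    """Average of per-example losses scaled by their weights (assumed reduction)."""
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Hypothetical scores: the adversary flags the third example as hardest.
weights = arl_example_weights([0.1, 0.1, 0.8])
loss = reweighted_loss([0.2, 0.3, 1.5], weights)
```

In the full method the learner minimises this weighted loss while the adversary is trained to maximise it, so examples with high loss receive more weight over time.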
Methodology
In this project we aimed to reproduce, replicate, and evaluate the results presented by Lahoti et al. First, the open-source TensorFlow implementation of the ARL model was used to test the reproducibility of the results. Second, the ARL model was re-implemented in PyTorch to test the replicability of the results. Finally, the significance of the ARL model's results was tested against a baseline model using p-value tests. Fully training and evaluating one model took about half a minute on a 2.3 GHz 8-core Intel Core i9 processor.
% A GPU of the Lisa Compute Cluster was used for an exhaustive 48 hour hyperparameter grid-search. 
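The comparison against a baseline can be illustrated with a simple significance test. The sketch below is not the test used in this report, which the abstract does not specify; it assumes a two-sided permutation test on the difference in mean scores across repeated runs. The function name `permutation_p_value` and the example scores are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(scores_a, scores_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test on the difference in mean scores.

    scores_a, scores_b: metric values from repeated runs of two models.
    Returns an estimated p-value for the null hypothesis that both sets
    of scores come from the same distribution.
    """
    rng = random.Random(seed)
    observed = abs(mean(scores_a) - mean(scores_b))
    pooled = list(scores_a) + list(scores_b)
    n_a = len(scores_a)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled scores to the two models.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical metric values from five runs of each model.
arl_scores = [0.907, 0.905, 0.909, 0.906, 0.908]
baseline_scores = [0.906, 0.908, 0.904, 0.907, 0.909]
p = permutation_p_value(arl_scores, baseline_scores, n_permutations=2_000)
```

A large p-value here would indicate that the observed difference between the two models is compatible with run-to-run variation.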
Results
Our findings suggest that (1) the paper is not reproducible, (2) the paper is replicable, yet (3) the results are not significant. The main results were reproduced within 2$\%$ of the reported values. However, given our limited knowledge of the original hyperparameters and our inability to produce several additional metrics presented in the paper, we concluded that the paper is not reproducible. The PyTorch implementation produced results within 1$\%$ of the reported values, suggesting that the paper is replicable. However, the improvement over a baseline model was not statistically significant.
What was easy
The paper by Lahoti et al. was concise and clearly structured. 
This, in combination with the well-documented open-source TensorFlow implementation, provided us with clear guidance when re-implementing the ARL model in PyTorch. 
What was difficult
Pre-processing the data proved difficult. In addition, some details regarding the model were not mentioned in the paper. Therefore, we had to make some impactful assumptions about, for example, the number of training steps and the original hyperparameters used.
Communication with original authors
The authors were contacted by email about some missing details in their paper. However, we did not receive a response.
Paper Url: https://openreview.net/forum?id=SiHVX35sDT