[Re] On the reproducibility of "CrossWalk: Fairness-Enhanced Node Representation Learning"

Published: 02 Aug 2023, Last Modified: 02 Aug 2023
MLRC 2022
Readers: Everyone
Keywords: rescience c, deepwalk, crosswalk, graph, node-embeddings, fairwalk, fairness, pytorch
TL;DR: Reproducibility study of "CrossWalk: Fairness-Enhanced Node Representation Learning".
Abstract: Scope of Reproducibility - The original authors present CrossWalk, an edge-reweighting algorithm that can be used in conjunction with random-walk-based node representation learning methods. We validate their claims that CrossWalk has a fairness-enhancing property, meaning it significantly reduces disparity, a measure of group fairness, and a performance-conserving property, meaning it has an insignificant effect on task performance.
Methodology - To perform a robust validation of the original authors' claims, we develop an independent, highly modular code-base with a complete re-implementation of the original experiments. Its design enables other researchers to easily run ablation studies with different datasets, experiments, or even algorithms that can be employed in conjunction with CrossWalk. Furthermore, we create an accessible implementation of CrossWalk itself under the MIT license.
Results - Our results provide solid evidence in favor of the performance-conserving property of CrossWalk. However, we find inconclusive evidence of the fairness-enhancing property, mostly due to large variation in the reproduced disparity values. On the other hand, we find additional evidence in favor of this property from an experiment showing the influence of CrossWalk's hyperparameters.
What was easy - The original authors provide a code-base implementing their methodology, which greatly helped us in understanding the material. Furthermore, their methodology is very modular, meaning we could test most parts of the pipeline independently.
What was difficult - The original work contains discrepancies between the specification of CrossWalk in the formulas, the pseudo-code, and the code-base. Also, we were unable to reproduce results for one of the datasets because of missing data. Finally, the original implementation is inadequately documented, and executing it required numerous manual steps that were non-trivial and time-consuming.
Communication with original authors - To clarify some details regarding the original implementation and its structure, we reached out to the authors when beginning to reproduce their work. The authors were quick to respond and answered all of our questions.
Paper Url: https://ojs.aaai.org/index.php/AAAI/article/view/21454
Paper Venue: AAAI 2022
Confirmation: The report PDF is generated from the provided camera-ready Google Colab script; the report metadata is verified from the camera-ready Google Colab script; the report contains correct author information; the report contains a link to code and SWH metadata; the report follows the ReScience LaTeX style guides as in the Reproducibility Report Template (https://paperswithcode.com/rc2022/registration); the report contains the Reproducibility Summary on the first page; the LaTeX .zip file is verified from the camera-ready Google Colab script.
Latex: zip
Journal: ReScience Volume 9 Issue 2 Article 27
Doi: https://www.doi.org/10.5281/zenodo.8173717
Code: https://archive.softwareheritage.org/swh:1:dir:0010ab17932c3abd9cb892f1b92da408df43689c