Keywords: rescience c, machine learning, deep learning, python, pytorch, robustness
TL;DR: We successfully reproduced the paper's findings and demonstrated that Common Gradient Descent, a solution to sub-population shifts and spurious correlations, performs as well as or better than Group-DRO and ERM.
Abstract: Scope of Reproducibility
This paper attempts to reproduce the main claims of Focus On The Common Good: Group Distributional Robustness Follows by Piratla et al., which introduces Common Gradient Descent (CGD), a novel optimization algorithm for handling spurious correlations and sub-population shifts. We have identified three central claims: (I) CGD is more robust than Group-DRO and leads to the largest average loss decrease across all groups, (II) CGD generalizes better across all groups than ERM, and (III) CGD monotonically decreases the group-average loss.
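The three claims are stated in terms of per-group losses rather than a single pooled loss. As a minimal sketch of the quantities involved, the snippet below computes the pooled (ERM-style) average, the group-average loss, and the worst-group loss from per-example losses. The function names and the toy numbers are hypothetical, and taking "group-average loss" to mean the unweighted mean of per-group mean losses is an assumption made for illustration, not a definition taken from the report.

```python
from collections import defaultdict


def group_losses(losses, groups):
    """Mean loss per group, given per-example losses and group ids."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for loss, group in zip(losses, groups):
        sums[group] += loss
        counts[group] += 1
    return {group: sums[group] / counts[group] for group in sums}


def summarize(losses, groups):
    """Pooled average, unweighted group average, and worst-group loss."""
    per_group = group_losses(losses, groups)
    values = list(per_group.values())
    return {
        "pooled_average": sum(losses) / len(losses),    # what ERM minimizes
        "group_average": sum(values) / len(values),     # assumed definition
        "worst_group": max(values),                     # what Group-DRO targets
    }


# Hypothetical data: a large, easy group "a" and a small, hard group "b".
example = summarize([0.25, 0.25, 0.25, 0.25, 2.0], ["a", "a", "a", "a", "b"])
```

In this toy case the pooled average is dominated by the majority group, while the group average and worst-group loss expose the minority group's much higher loss, which is the gap that group-robust methods aim to close.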
Methodology
The experiments in this paper are based on the open-source implementation of CGD released by the authors, which required some modifications to work with the latest version of the WILDS framework.
Results
The results from our experiments were overall in line with the paper. We show that CGD outperforms Group-DRO on synthetic datasets with induced spurious correlations, but the benefits of CGD are not clear in a real-world setting. Beyond the results of the original paper, we attempted to empirically verify the authors' mathematical proof that CGD monotonically decreases the group-average loss; this attempt was inconclusive.
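One simple way to probe a monotonic-decrease claim empirically is to record the group-average loss after each update and list the steps at which it increases. The sketch below is illustrative only: it is not the procedure used in this report, the helper name and the loss values are hypothetical, and the tolerance parameter is an assumption added because loss estimates from finite batches are noisy.

```python
def monotonicity_violations(loss_history, tolerance=0.0):
    """Indices t where loss_history[t] exceeds loss_history[t - 1] by more than tolerance."""
    return [
        t
        for t in range(1, len(loss_history))
        if loss_history[t] > loss_history[t - 1] + tolerance
    ]


# Hypothetical group-average losses recorded after successive updates.
history = [1.0, 0.75, 0.8, 0.5]

strict = monotonicity_violations(history)                   # flags the rise at step 2
lenient = monotonicity_violations(history, tolerance=0.1)   # small rises are ignored
```

An empty list is consistent with a monotonic decrease over the recorded steps but does not prove it; a non-empty list only shows that the recorded estimates rose, which may reflect noise rather than a failure of the underlying guarantee.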
What was easy
The implementation from the original paper was available on GitHub with detailed instructions provided in the documentation. It was also relatively easy to introduce additional datasets and algorithms to the WILDS codebase.
What was difficult
The CGD implementation and several experiments could not be run out-of-the-box and required major modifications to run with the latest version of WILDS. The majority of the hyperparameter values for the experiments were not clearly stated. Lastly, the experiments were computationally expensive and required 440 GPU hours.
Communication with original authors
We reached out to the original authors to request additional information about the hyperparameter values and the implementation of some experiments. The authors promptly responded with sources for the hyperparameters, useful information about WILDS, and some missing parts of the code. Overall, the communications were timely and effective.
Paper Url: https://openreview.net/forum?id=irARV_2VFs4
Paper Review Url: https://openreview.net/forum?id=irARV_2VFs4
Paper Venue: ICLR 2022
Supplementary Material: zip
Confirmation: The report PDF is generated from the provided camera-ready Google Colab script; the report metadata is verified from the camera-ready Google Colab script; the report contains correct author information; the report contains a link to code and SWH metadata; the report follows the ReScience LaTeX style guides as in the Reproducibility Report Template (https://paperswithcode.com/rc2022/registration); the report contains the Reproducibility Summary on the first page; the LaTeX .zip file is verified from the camera-ready Google Colab script.
Latex: zip
Journal: ReScience Volume 9 Issue 2 Article 23
Doi: https://www.doi.org/10.5281/zenodo.8173707
Code: https://archive.softwareheritage.org/swh:1:dir:4a89288fc050158c419caee05af572cad7b71a12