[Re] Reproducibility study of "Focus On The Common Good: Group Distributional Robustness Follows"

Published: 02 Aug 2023, Last Modified: 02 Aug 2023
Venue: MLRC 2022
Readers: Everyone
Keywords: rescience c, machine learning, deep learning, python, pytorch, robustness
TL;DR: We successfully reproduced the paper's findings and demonstrated that Common Gradient Descent, a solution to sub-population shifts and spurious correlations, performs on par with or better than Group-DRO and ERM.
Abstract:
Scope of Reproducibility: This paper attempts to reproduce the main claims of "Focus On The Common Good: Group Distributional Robustness Follows" by Piratla et al., which introduces Common Gradient Descent (CGD), a novel optimization algorithm for handling spurious correlations and sub-population shifts. We have identified three central claims: (I) CGD is more robust than Group-DRO and leads to the largest average loss decrease across all groups, (II) CGD generalizes better across all groups than ERM, and (III) CGD monotonically decreases the group-average loss.
Methodology: The experiments in this paper are based on the open-source implementation of CGD released by the authors, which required some modifications to work with the latest version of the WILDS framework.
Results: The results of our experiments were overall in line with the paper. We show that CGD outperforms Group-DRO on synthetic datasets with induced spurious correlations, but the benefits of CGD are not clear in a real-world setting. Beyond the results of the original paper, our attempt to empirically verify the authors' mathematical proof that CGD monotonically decreases the loss was inconclusive.
What was easy: The implementation from the original paper was available on GitHub, with detailed instructions provided in the documentation. It was also relatively easy to introduce additional datasets and algorithms to the WILDS codebase.
What was difficult: The CGD implementation and several experiments could not be run out of the box and required major modifications to run with the latest version of WILDS. The majority of the hyperparameter values for the experiments were not clearly stated. Lastly, the experiments were computationally expensive, requiring 440 GPU hours.
Communication with original authors: We reached out to the original authors to request additional information about the hyperparameter values and the implementation of some experiments. The authors promptly responded with sources for the hyperparameters, useful information about WILDS, and some missing parts of the code. Overall, the communication was timely and effective.
Paper Url: https://openreview.net/forum?id=irARV_2VFs4
Paper Review Url: https://openreview.net/forum?id=irARV_2VFs4
Paper Venue: ICLR 2022
Supplementary Material: zip
Confirmation: The report PDF is generated from the provided camera-ready Google Colab script; the report metadata is verified from the camera-ready Google Colab script; the report contains correct author information; the report contains a link to the code and SWH metadata; the report follows the ReScience LaTeX style guides as in the Reproducibility Report Template (https://paperswithcode.com/rc2022/registration); the report contains the Reproducibility Summary on the first page; the LaTeX .zip file is verified from the camera-ready Google Colab script.
Latex: zip
Journal: ReScience Volume 9 Issue 2 Article 23
Doi: https://doi.org/10.5281/zenodo.8173707
Code: https://archive.softwareheritage.org/swh:1:dir:4a89288fc050158c419caee05af572cad7b71a12