RC2020 Report: Learning De-biased Representations with Biased Representations

Jan 31, 2021 (edited Apr 01, 2021) · ML Reproducibility Challenge 2020 Blind Submission
  • Keywords: Biases, De-bias, Learning Representations
  • Abstract:
    Scope of Reproducibility. The authors formalize and attempt to tackle the so-called "cross-bias generalization" problem with a new approach they introduce called ReBias. This report contains the results of our attempts at reproducing the work in the application area of image recognition, specifically on the biased MNIST and ImageNet datasets. We compare ReBias with other methods (Vanilla, Biased, and RUBi, as implemented by the authors) and conclude with a discussion of the validity of the claims made by the paper. Our reproducibility source code is available in the supplementary material we have uploaded with the revision.
    Methodology. We used the authors' code available at https://github.com/clovaai/rebias and did not modify any part of it, to make sure the initial conditions of the experiments were identical. We did not re-implement any part of the pipeline except for ImageNet, whose implementation issues we discuss later on. The total training time, with final results averaged over 3 runs, amounted to 128 hours on a single NVIDIA GeForce GTX 1080 (8 GB GDDR5X, 2560 CUDA cores).
    Results. We were able to reproduce the results reported for the biased MNIST dataset to within 1% of the original values reported in the paper. Like the authors, we report results averaged over 3 runs. However, in a later section, we provide some additional results that appear to weaken the central claim of the paper. We were not able to reproduce the results for ImageNet as in the original paper, but present our results along with a further discussion.
    What was easy. We found reproducing the biased MNIST results to be easy. The code provided was clear to understand and ready to run as a simple terminal command.
    What was difficult. Although the majority of the code used by the authors for the training pipeline was made publicly available, there were a few things we had to be careful about. First, the right version of the torchvision library must be used: other users should install the most current version instead of the version pinned in the requirements.txt file. Second, the original 9-class subset of ImageNet used in the paper is not provided, but the WNIDs are; we wrote our own script to scrape these images from the ILSVRC-2017 links. Third, we found it difficult to reproduce two of the methods ReBias is compared against, HEX and Learned-Mixin, as these were the authors' own implementations of those methods (the authors' implementation of HEX is not publicly available); we discuss the reasons in a later section. The most significant difficulty was avoiding exploding gradients in one of the experimental settings, even though we used the exact same training pipeline.
    Communication with original authors. We had some preliminary contact with the authors regarding the implementation of the methods discussed above. We were informed that the HEX implementation was exactly as in the original work, and that the implementations of RUBi and Learned-Mixin had been adapted from the NLP domain to the vision domain. Further, the authors confirmed some details about the dataset and training which were previously unclear.
  • Paper Url: http://proceedings.mlr.press/v119/bahng20a.html
  • Supplementary Material: zip
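The report does not say how the exploding gradients mentioned above were eventually handled. One standard mitigation, offered here as our own suggestion rather than the authors' method, is gradient-norm clipping (in PyTorch, `torch.nn.utils.clip_grad_norm_`). A minimal pure-Python sketch of the underlying computation:

```python
import math

def clip_grad_norm(grads, max_norm):
    """Rescale a flat list of gradient values so their global L2 norm
    does not exceed max_norm; values below the threshold are untouched.
    This mirrors the computation behind PyTorch's clip_grad_norm_."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        return [g * scale for g in grads]
    return list(grads)

# A gradient vector with L2 norm 10 is rescaled to norm 5:
print(clip_grad_norm([6.0, 8.0], max_norm=5.0))  # [3.0, 4.0]
```

In a real training loop the equivalent call would sit between `loss.backward()` and `optimizer.step()`, capping the update size whenever a batch produces unusually large gradients.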