Keywords: transfer learning, Bayesian inference, informative priors, reproducibility report, machine learning
TL;DR: Reproducibility report for paper "Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Prior".
Abstract: REPRODUCIBILITY SUMMARY
Scope of Reproducibility
In this work, we study the reproducibility of the paper: Pre-Train Your Loss: Easy Bayesian Transfer Learning with Informative Priors. The paper proposes a three-step pipeline for replacing standard transfer learning with a pre-trained prior. The first step is training a prior, the second is re-scaling of a prior, and the third is inference.
The authors claim that increasing the rank and the scaling factor improves performance on the downstream task. They also argue that using Bayesian learning with informative prior leads to a more data-efficient and improved performance compared to standard SGD transfer learning or using non-informative prior. We reproduce the main claims on one of the four data sets in the paper.
Methodology
We used a combination of the authors' and our code. The authors provided a training pipeline for the user but not the code to fully reproduce the paper. We modified the training pipeline to suit our needs and created a testing pipeline to evaluate the models. We reproduced the results for the Oxford-102-Flowers data set on an Nvidia RTX 3070 GPU using approximately 310 GPU hours for the main results.
Results
Our results confirm most of the claims tested, although we could not achieve the exact same accuracy due to missing hyper-parameters. We reproduced the trend in how scaling the prior impacts the performance and how a learned prior outperforms a non-learned prior.
On contrary, we could not reproduce the effect of rank in low-rank covariance approximation on model performance, as well as the beneficial boost in performance of Bayesian learning compared to the standard SGD.
What was easy
The authors' implementation provides various training and logging parameters. It is also helpful that the authors provided both the learned priors and scripts for the download, split and pre-processing of the data sets used in the study.
What was difficult
Setting the environment for the used packages to work correctly was difficult. Although many parameters are available for running the pipeline, their descriptions are misguiding, therefore a lot of time went into clarifying the parameter function and debugging different settings. The training also took a while, especially when training 5 models per data point.
Communication with original authors
We contacted the authors via e-mail about their pipeline and their use of hyper-parameters but did not hear back.
Paper Url: https://openreview.net/forum?id=ao30zaT3YL
Paper Review Url: https://openreview.net/forum?id=ao30zaT3YL
Paper Venue: ICML 2022
Supplementary Material: zip
Confirmation: The report pdf is generated from the provided camera ready Google Colab script, The report metadata is verified from the camera ready Google Colab script, The report contains correct author information., The report contains link to code and SWH metadata., The report follows the ReScience latex style guides as in the Reproducibility Report Template (https://paperswithcode.com/rc2022/registration)., The report contains the Reproducibility Summary in the first page., The latex .zip file is verified from the camera ready Google Colab script
Latex: zip
Journal: ReScience Volume 9 Issue 2 Article 9
Doi: https://www.doi.org/10.5281/zenodo.8173668
Code: https://archive.softwareheritage.org/swh:1:dir:a67381741fc59350e147883d925d6246f60bb3ef
0 Replies
Loading