Reproducibility Report: Your Classifier is Secretly an Energy Based Model and You Should Treat it Like One

06 Dec 2020 (modified: 05 May 2023) · ML Reproducibility Challenge 2020 Blind Submission · Readers: Everyone
Abstract:

Scope of Reproducibility: We validated the Joint Energy-based Model (JEM) training technique recently developed by Grathwohl et al. Specifically, we checked performance on image classification, generation, and uncertainty calibration.

Methodology: We re-implemented the paper's pipeline from scratch, based on the algorithm described in the paper. We referred to the authors' code only for subtleties, such as data normalization, that were not explicitly mentioned in the paper. Training JEM took about 12 hours using a Wide ResNet 28-2 architecture.

Results: We verified that JEM performed similarly to how it was presented in \cite{Grathwohl:2019}. We could not reproduce the exact numerical results of the paper due to the long training time of the algorithm.

What was easy: The paper is well written, with the algorithm and motivation clearly explained.

What was difficult: We were not able to reproduce the exact results of the paper, since constraints on computation time forced us to use a smaller network than the authors. Running the authors' method with the model and hyperparameters described in their original paper would have required about 80 hours of training on our hardware (which we believe is comparable to the authors'), rather than the 36 hours the authors reported. In addition, while the training method produces a well-calibrated hybrid model, training itself is unstable: we needed to restart training a few times due to the loss diverging.

Communication with original authors: We spoke to the authors, who corroborated the second of the above difficulties; however, the origin of the lengthy training time is less clear.
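For context on the technique being reproduced: JEM reinterprets a standard classifier's logits $f_\theta(x)$ as defining an energy $E_\theta(x) = -\log \sum_y \exp(f_\theta(x)[y])$ over inputs, while the usual softmax $p(y \mid x)$ is left unchanged. A minimal sketch of these two quantities in plain Python (the function names here are illustrative, not from the authors' code):

```python
import math

def energy(logits):
    # JEM's energy of an input x is the negative logsumexp of
    # the classifier logits: E(x) = -log sum_y exp(f(x)[y]).
    m = max(logits)  # subtract the max for numerical stability
    return -(m + math.log(sum(math.exp(l - m) for l in logits)))

def class_probs(logits):
    # The usual softmax p(y|x); unchanged by the energy view.
    m = max(logits)
    exps = [math.exp(l - m) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

# Example: two equal logits give energy -log(2) and a uniform p(y|x).
print(energy([0.0, 0.0]))       # -0.6931...
print(class_probs([0.0, 0.0]))  # [0.5, 0.5]
```

Training then combines the standard cross-entropy loss on $p(y \mid x)$ with a maximum-likelihood term on $p(x)$ estimated via SGLD sampling, which is the part the report identifies as unstable.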
Paper Url: https://openreview.net/forum?id=Hkxzx0NtDB