[Reproducibility Report] SAdam: A Variant of Adam for Strongly Convex Functions

31 Jan 2021 (modified: 05 May 2023) ML Reproducibility Challenge 2020 Blind Submission
Keywords: Online convex optimization, Adaptive online learning, Adam, SAdam
Abstract: We investigate "SAdam: A Variant of Adam for Strongly Convex Functions" by Guanghui Wang et al. from the perspective of reproducibility. We concentrate on verifying the main empirical claims of the paper, and to keep this verification free of bias, we use a different framework (PyTorch) than the authors (TensorFlow) and write the source code from scratch without referring to the existing implementation.

Scope of Reproducibility: The central claim of the original paper is that modifying Adam to exploit strong convexity yields faster convergence owing to a data-dependent logarithmic regret bound, much akin to the recently proposed SC-RMSProp. Additionally, the authors claim that the performance advantage is sustained even on non-convex optimization problems. We run the same experiments as those listed in the original paper, attempting to reproduce the observations in a different framework (PyTorch) than the authors' (TensorFlow).

Methodology: We did not use any of the provided source code for our own implementation, preferring to implement everything from scratch; however, we did run the provided code to verify the reported results. Although we varied the hyperparameters to observe their effect on the results, the values given in the original paper proved to work best. We used the NVIDIA Tesla K80 GPU provided on the Google Colab platform for all our experiments.

Results: All the claims of the paper are in line with our empirical observations, and our results are within 1-2% accuracy of the reported values. For one experiment listed in the appendix of the original paper, involving training ResNet-18 with all of the optimizers, our results deviate. However, this does not discredit any of the claims made by the original paper, as that additional experiment was included merely to demonstrate a further benefit of using SAdam.
What was easy: All experiments were easy to set up, from procuring the three datasets to selecting the hyperparameters. In addition, none of the experiments had high runtime requirements, so no additional hardware was needed and training on Google Colab was sufficient.

What was difficult: Implementing the optimizers against which SAdam was to be compared took considerable time and compounded the work at hand. Besides that, contradictions between the mini-batch sizes proposed by the authors of SC-RMSProp and those of SAdam delayed the completion of the reproducibility study.

Communication with original authors: The authors shared their codebase with us in our very first correspondence. We reached out to them on multiple occasions for clarification, and they were always prompt in answering our questions.
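To make the central claim concrete: as described in the paper, SAdam modifies Adam by dropping the square root in the preconditioner, using a time-varying second-moment decay, and letting the step size and damping term vanish as 1/t, which is what enables the logarithmic regret bound for strongly convex losses. The following is a minimal NumPy sketch of this update under those assumptions (the hyperparameter values and exact schedules here are illustrative, not the authors' reference implementation):

```python
import numpy as np

def sadam(grad_fn, x0, lr=0.5, beta1=0.9, gamma=0.9, delta=1e-2, steps=500):
    """Illustrative sketch of the SAdam update.

    Differences from Adam, per the paper's description: no square root over
    the second moment, a time-varying decay beta2_t, a step size decaying as
    lr/t, and a vanishing damping term delta/t. Schedules/constants here are
    assumptions for illustration.
    """
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)  # first-moment estimate, as in Adam
    v = np.zeros_like(x)  # second-moment estimate (used without sqrt)
    for t in range(1, steps + 1):
        g = grad_fn(x)
        beta2_t = 1.0 - gamma / t                 # time-varying decay
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2_t * v + (1.0 - beta2_t) * g * g
        x = x - (lr / t) * m / (v + delta / t)    # no sqrt in preconditioner
    return x
```

On a strongly convex toy objective such as f(x) = (x - 3)^2, this update moves steadily toward the minimizer, with the 1/t step size providing the decay the regret analysis relies on.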
Paper Url: https://openreview.net/forum?id=zOw92TqIvAH&noteId=Vr_r9wdS9t&referrer=%5BML%20Reproducibility%20Challenge%202020%5D(%2Fgroup%3Fid%3DML_Reproducibility_Challenge%2F2020)
Supplementary Material: zip