Published: 01 Jan 2021, Last Modified: 28 Apr 2023, ICML 2021, Readers: Everyone
Abstract: Stochastic Gradient Descent (SGD) is a popular tool in training large-scale machine learning models. Its performance, however, is highly variable, depending crucially on the choice of the step size...
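The step-size sensitivity mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's method: it runs plain SGD on the toy objective f(x) = 0.5 * x^2 with Gaussian gradient noise, and the function name `sgd_on_quadratic`, the starting point, and the noise level are all assumptions chosen for illustration.

```python
import random


def sgd_on_quadratic(step_size, steps=100, x0=5.0, noise_std=0.1, seed=0):
    """Run SGD on f(x) = 0.5 * x**2 with noisy gradients (hypothetical toy setup).

    The exact gradient is x; each step sees x plus zero-mean Gaussian noise.
    Returns the final iterate.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    x = x0
    for _ in range(steps):
        noisy_grad = x + rng.gauss(0.0, noise_std)
        x -= step_size * noisy_grad
    return x


# Three step sizes on the same problem:
too_small = sgd_on_quadratic(0.001)  # barely moves away from x0
reasonable = sgd_on_quadratic(0.1)   # settles near the minimizer x = 0
too_large = sgd_on_quadratic(2.5)    # |1 - step_size| > 1, so iterates blow up
```

On this toy problem each update multiplies the iterate by (1 - step_size) up to noise, so the same algorithm stalls, converges, or diverges depending only on the step size.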