Published: 01 Jan 2023, Last Modified: 01 Oct 2023, ICML 2023, Readers: Everyone
Abstract: We propose a tuning-free dynamic SGD step size formula, which we call Distance over Gradients (DoG). The DoG step sizes depend on simple empirical quantities (distance from the initial point and no...
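
To make the idea of a "distance over gradients" step size concrete, here is a minimal pure-Python sketch. The abstract is cut off in this listing, so the exact rule below is an assumption based on the commonly stated form of DoG: the step size at iteration t is the largest distance from the initial point seen so far, divided by the square root of the running sum of squared gradient norms. The function name `dog_sgd`, the initial movement scale `r_eps`, and the stabilizer `eps` are illustrative choices, not taken from the text above.

```python
import math


def dog_sgd(grad_fn, x0, steps=1000, r_eps=1e-6, eps=1e-8):
    """Run SGD with a DoG-style step size (illustrative sketch, see assumptions above).

    grad_fn: callable mapping a list of floats to a gradient (list of floats).
    x0:      initial point (list of floats).
    r_eps:   assumed small initial movement scale, used before any distance exists.
    eps:     assumed small constant that keeps the denominator positive.
    """
    x = list(x0)
    # Seed the "distance" with a tiny value proportional to (1 + ||x0||).
    max_dist = r_eps * (1.0 + math.sqrt(sum(v * v for v in x0)))
    grad_sq_sum = 0.0

    for _ in range(steps):
        g = grad_fn(x)
        # Denominator: running sum of squared gradient norms.
        grad_sq_sum += sum(v * v for v in g)
        # Step size: (max distance from x0 so far) / sqrt(sum of squared gradient norms).
        eta = max_dist / math.sqrt(grad_sq_sum + eps)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
        # Numerator: track the largest distance from the initial point.
        dist = math.sqrt(sum((xi - x0i) ** 2 for xi, x0i in zip(x, x0)))
        max_dist = max(max_dist, dist)

    return x


# Usage: minimize f(x) = 0.5 * ||x - target||^2, whose gradient is x - target.
target = [3.0, -2.0]
x_final = dog_sgd(lambda x: [xi - ti for xi, ti in zip(x, target)], x0=[0.0, 0.0])
```

No learning rate is passed in: the step size starts tiny, grows as the iterates move away from the initial point, and is damped by the accumulated gradient norms. This toy run uses exact gradients and returns the last iterate; it is meant only to show the mechanics of the step size, not to reproduce the paper's algorithm or results.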