Asymptotic Analysis of Conditioned Stochastic Gradient Descent

Published: 16 Aug 2023, Last Modified: 16 Aug 2023. Accepted by TMLR.
Abstract: In this paper, we investigate a general class of stochastic gradient descent (SGD) algorithms, called $\textit{conditioned}$ SGD, based on a preconditioning of the gradient direction. Using a discrete-time approach with martingale tools, we establish under mild assumptions the weak convergence of the rescaled sequence of iterates for a broad class of conditioning matrices, including stochastic first-order and second-order methods. Almost sure convergence results, which may be of independent interest, are also presented. Interestingly, the asymptotic normality result relies on a stochastic equicontinuity property, so that when the conditioning matrix is an estimate of the inverse Hessian, the algorithm is asymptotically optimal.
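To make the update concrete, the following is a minimal sketch of a conditioned SGD iteration on a least-squares problem. The setup (a linear model, a Robbins–Monro step size $\gamma_k = 1/k$, and the exact inverse Hessian used as the conditioning matrix $C_k$) is an illustrative assumption, not the paper's experimental protocol; in practice $C_k$ would itself be a stochastic estimate.

```python
import numpy as np

# Illustrative conditioned SGD: x_{k+1} = x_k - gamma_k * C_k * g_k,
# where g_k is a stochastic gradient and C_k a conditioning matrix.
# All problem details below are hypothetical, chosen for a self-contained demo.

rng = np.random.default_rng(0)
n, d = 500, 5
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star + 0.1 * rng.normal(size=n)   # noisy linear observations

H = A.T @ A / n                 # Hessian of the empirical least-squares loss
C = np.linalg.inv(H)            # here C_k is fixed to the exact inverse Hessian

x = np.zeros(d)
for k in range(1, 5001):
    i = rng.integers(n)
    g = (A[i] @ x - b[i]) * A[i]    # stochastic gradient from one sample
    gamma = 1.0 / k                 # Robbins-Monro step size
    x = x - gamma * C @ g           # conditioned SGD update

print(np.linalg.norm(x - x_star))   # distance to the true parameter
```

With $C_k = H^{-1}$, the rescaled iterates attain the optimal asymptotic covariance, which is the regime the abstract's optimality statement refers to.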
License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: Uploaded the camera-ready version of the paper.
Assigned Action Editor: ~Simon_Lacoste-Julien1
Submission Number: 1193