Analysis of Stochastic Gradient Descent for Learning Linear Neural Networks

Published: 21 May 2023, Last Modified: 14 Jul 2023
SampTA 2023 Abstract
Abstract: In this work we analyze stochastic gradient descent (SGD) for learning deep linear neural networks. We use an analytical approach that combines SGD iterates and gradient flow trajectories, based on stochastic approximation theory. We then establish the almost sure boundedness of the SGD iterates and a convergence guarantee for learning deep linear neural networks. Most studies on the analysis of SGD for nonconvex problems have focused entirely on convergence properties which only indicate that the second moment of the loss gradient tends to zero. Our study demonstrates that, for learning deep linear neural networks, SGD converges almost surely to a critical point of the square loss.
Submission Type: Abstract
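As a rough illustration of the setting analyzed in the abstract, the sketch below runs plain SGD on the square loss of a deep linear network f(x) = W_L ⋯ W_1 x. The depth, dimensions, step size, and synthetic data are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

# Illustrative sketch (assumptions, not the paper's setup): SGD on the
# square loss 0.5 * ||W_L ... W_1 x - y||^2 of a deep linear network.
rng = np.random.default_rng(0)
d, L, n = 5, 3, 200          # width, depth, number of samples (assumed)
lr, steps = 1e-2, 20_000     # step size and iteration count (assumed)

# Synthetic regression data from a ground-truth linear map.
A_true = rng.standard_normal((d, d))
X = rng.standard_normal((n, d))
Y = X @ A_true.T

# Layer weights W_1, ..., W_L, initialized near the identity.
Ws = [np.eye(d) + 0.1 * rng.standard_normal((d, d)) for _ in range(L)]

for _ in range(steps):
    i = rng.integers(n)                  # sample one data point: the "stochastic" in SGD
    # Forward pass, caching h_0 = x and h_k = W_k h_{k-1}.
    hs = [X[i]]
    for W in Ws:
        hs.append(W @ hs[-1])
    delta = hs[-1] - Y[i]                # gradient of the loss w.r.t. the output
    # Backward pass: grad of W_k is delta_k h_{k-1}^T, and delta_{k-1} = W_k^T delta_k.
    for k in reversed(range(L)):
        grad = np.outer(delta, hs[k])
        delta = Ws[k].T @ delta          # propagate before updating W_k
        Ws[k] -= lr * grad               # SGD step

# Average square loss of the learned end-to-end map W_L ... W_1.
W_end = np.linalg.multi_dot(Ws[::-1])
print("final loss:", 0.5 * np.mean(np.sum((X @ W_end.T - Y) ** 2, axis=1)))
```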