Keywords: Normalized Steepest Descent, Wasserstein Space, Langevin Dynamics
TL;DR: We develop a mean-field analysis for normalized steepest descent dynamics in Wasserstein space under the linear minimization oracle framework.
Abstract: We develop a mean-field analysis of normalized steepest descent for two-layer neural networks in Wasserstein space. We first study the induced gradient flow in continuous time, establishing global convergence guarantees in both Euclidean and Wasserstein spaces. Our analysis reveals a fundamental distinction between normalized and unnormalized dynamics: the unnormalized flow converges linearly, whereas the normalized flow reaches the optimum in finite time. We then extend the framework to finite-particle and discrete-time settings, introducing LMO-driven Langevin dynamics and adaptive LMO particle schemes and establishing non-asymptotic convergence and stationarity guarantees for them. Collectively, our results provide a theoretical foundation for normalized steepest descent, a class of optimization methods that has recently become popular for training neural networks.
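The contrast between linear and finite-time convergence can be seen on a toy Euclidean problem. The sketch below is an illustrative assumption, not the paper's Wasserstein-space construction: it takes f(x) = 0.5 * ||x||^2, uses the linear minimization oracle (LMO) over the unit l2 ball as the normalized direction, and integrates both flows with explicit Euler steps. The helper names (`lmo_l2`, `lmo_linf`, `hitting_time`) and the step size are hypothetical choices made for this example. For this objective, the normalized flow reaches the minimizer at time ||x0||, while the time for the unnormalized flow to reach a tolerance grows like log(||x0|| / tol).

```python
import math


def lmo_l2(grad):
    """LMO over the unit l2 ball: argmin over ||d||_2 <= 1 of <grad, d>."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return [0.0 for _ in grad]  # any feasible d is optimal; pick zero
    return [-g / norm for g in grad]


def lmo_linf(grad):
    """LMO over the unit l-infinity ball: the negative sign of each entry."""
    return [-math.copysign(1.0, g) if g != 0.0 else 0.0 for g in grad]


def hitting_time(x0, normalized, tol, dt=1e-3, t_max=20.0):
    """Explicit Euler on f(x) = 0.5 * ||x||^2 (toy objective, an assumption).

    Returns the first time at which ||x|| <= tol, or None if t_max is
    reached first. Keep tol >= dt: the discretized normalized flow cannot
    resolve distances smaller than one step.
    """
    x = list(x0)
    for k in range(int(t_max / dt)):
        if math.sqrt(sum(v * v for v in x)) <= tol:
            return k * dt
        grad = x  # gradient of 0.5 * ||x||^2 is x itself
        direction = lmo_l2(grad) if normalized else [-g for g in grad]
        x = [v + dt * d for v, d in zip(x, direction)]
    return None


x0 = [3.0, 4.0]  # ||x0|| = 5
for tol in (1e-1, 1e-2):
    t_normalized = hitting_time(x0, normalized=True, tol=tol)
    t_plain = hitting_time(x0, normalized=False, tol=tol)
    print(f"tol={tol:g}  normalized: {t_normalized:.3f}  unnormalized: {t_plain:.3f}")
```

Tightening `tol` leaves the normalized hitting time bounded by ||x0|| = 5, whereas the unnormalized hitting time keeps increasing. `lmo_linf` is included only to show that changing the norm ball changes the steepest-descent direction (here to a sign-based update).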
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 40