Langevin Learning Dynamics in Lazy and Non-Lazy Wide Neural Networks

Published: 09 Jun 2025, Last Modified: 09 Jun 2025 · HiLD at ICML 2025 Poster · CC BY 4.0
Keywords: Learning Dynamics, Langevin Dynamics, Statistical Mechanics, Stochastic differential equations, High dimensional dynamics, infinite width regime, lazy learning, non-lazy learning
TL;DR: We extend current theories to compare Langevin dynamics in deep neural networks in the lazy and non-lazy regimes.
Abstract: Langevin dynamics (gradient descent with additive stochastic noise) provides a powerful framework for studying learning dynamics in deep neural networks, bridging deterministic optimization and statistical inference. It has been shown to unify two prominent theories of wide networks: the Neural Tangent Kernel (NTK), which assumes linearized gradient-descent dynamics, and the Bayesian Neural Network Gaussian Process (NNGP), which treats learning as posterior inference. In this work, we extend the framework to compare lazy and non-lazy learning in linear networks, analyzing how different parameters affect the learning dynamics of both the predictor and the kernel in each regime. We show that in the non-lazy case the network is more resilient to noise and to small initial conditions.
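For context (a generic form, not an equation quoted from the paper), the Langevin dynamics referred to in the abstract is usually written as the stochastic differential equation
\[
d\theta_t = -\nabla_\theta \mathcal{L}(\theta_t)\,dt + \sqrt{2T}\,dW_t,
\]
where $\theta_t$ are the network parameters, $\mathcal{L}$ is the training loss, $T$ is a temperature (noise strength), and $W_t$ is a standard Wiener process. Its stationary distribution is proportional to $\exp(-\mathcal{L}(\theta)/T)$, which is why this dynamics connects the gradient-descent (NTK) picture with the Bayesian-posterior (NNGP) picture mentioned above.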
Student Paper: Yes
Submission Number: 84