Keywords: Deep learning, Learning theory, Learning Regime, Rich, Lazy
TL;DR: The paper offers explicit solutions for gradient flow in two-layer linear networks under various initializations, modeling the shift between lazy and rich learning, with applications to neuroscience and machine learning.
Abstract: Biological and artificial neural networks create internal representations for complex tasks. In artificial networks, the ability to form task-specific representations is shaped by datasets, architectures, initialization strategies, and optimization algorithms. Previous studies show that different initializations lead to either a lazy regime, where representations stay static, or a rich regime, where they evolve dynamically. This work examines how initialization affects learning dynamics in deep linear networks, deriving exact solutions for $\lambda$-balanced initializations, which parameterize the relative scale of the weights across layers. These solutions explain how representations and the Neural Tangent Kernel evolve across the spectrum from the rich to the lazy regime, with implications for continual, reversal, and transfer learning in both neuroscience and machine learning applications.
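As a rough illustration of the setting described in the abstract (not the authors' code), the sketch below constructs a two-layer linear network $f(x) = W_2 W_1 x$ with a $\lambda$-balanced initialization, assuming the standard balance condition $W_2^\top W_2 - W_1 W_1^\top = \lambda I$ from the deep linear network literature, and checks that this quantity is approximately conserved under small-step gradient descent (a discretization of gradient flow). All dimensions, values, and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: teacher map W_star, inputs X, targets Y.
d, n = 5, 200
X = rng.standard_normal((d, n))
W_star = rng.standard_normal((d, d))
Y = W_star @ X

# Assumed lambda-balanced initialization: with W1 = a*Q1 and W2 = b*Q2
# (Q1, Q2 orthogonal), W2^T W2 - W1 W1^T = (b^2 - a^2) I = lambda * I.
lam = 0.5
a = 0.1
b = np.sqrt(lam + a**2)
Q1, _ = np.linalg.qr(rng.standard_normal((d, d)))
Q2, _ = np.linalg.qr(rng.standard_normal((d, d)))
W1, W2 = a * Q1, b * Q2

lr = 1e-3  # small step size approximates gradient flow
for _ in range(2000):
    E = (W2 @ W1 @ X - Y) / n          # residual of the squared loss
    grad_W2 = E @ (W1 @ X).T
    grad_W1 = W2.T @ E @ X.T
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

# The balance matrix is (approximately) invariant along the trajectory.
balance_drift = np.linalg.norm(W2.T @ W2 - W1 @ W1.T - lam * np.eye(d))
loss = 0.5 * np.mean((W2 @ W1 @ X - Y) ** 2)
print(f"loss = {loss:.4f}, balance drift = {balance_drift:.2e}")
```

In this sketch, small $\lambda$ (and small weight scale) pushes the dynamics toward the rich regime, while large $\lambda$ keeps the early-layer representation nearly static, consistent with the lazy regime described in the abstract.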
Is Neurips Submission: No
Submission Number: 8