Keywords: Deep learning, Learning theory, Learning Regime, Rich, Lazy
TL;DR: The paper offers explicit solutions for gradient flow in two-layer linear networks under various initializations, modeling the shift between lazy and rich learning, with applications to neuroscience and machine learning.
Abstract: Biological and artificial neural networks create internal representations for complex tasks. In artificial networks, the ability to form task-specific representations is shaped by datasets, architectures, initialization strategies, and optimization algorithms. Previous studies show that different initializations lead to either a lazy regime, where representations stay static, or a rich regime, where they evolve dynamically. This work examines how initialization affects learning dynamics in deep linear networks, deriving exact solutions for $\lambda$-balanced initializations, which parameterize the relative scale of the weights across layers. These solutions explain how representations and the Neural Tangent Kernel evolve across the spectrum from the rich to the lazy regime, with implications for continual, reversal, and transfer learning in both neuroscience and machine learning applications.
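As a rough illustration of the setting described in the abstract (not the authors' code), the sketch below constructs a two-layer linear network $f(x) = W_2 W_1 x$ with a $\lambda$-balanced initialization, assuming the standard balance condition $W_2^\top W_2 - W_1 W_1^\top = \lambda I$ from the deep linear network literature, and checks that this quantity is approximately conserved under small-step gradient descent (a discretization of gradient flow). All dimensions, values, and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression task: teacher map W_star, inputs X, targets Y.
d, n = 5, 200
X = rng.standard_normal((d, n))
W_star = rng.standard_normal((d, d))
Y = W_star @ X

# Assumed lambda-balanced initialization: with W1 = a*Q1 and W2 = b*Q2
# (Q1, Q2 orthogonal), W2^T W2 - W1 W1^T = (b^2 - a^2) I = lambda * I.
lam = 0.5
a = 0.1
b = np.sqrt(lam + a**2)
Q1, _ = np.linalg.qr(rng.standard_normal((d, d)))
Q2, _ = np.linalg.qr(rng.standard_normal((d, d)))
W1, W2 = a * Q1, b * Q2

lr = 1e-3  # small step size approximates gradient flow
for _ in range(2000):
    E = (W2 @ W1 @ X - Y) / n          # residual of the squared loss
    grad_W2 = E @ (W1 @ X).T
    grad_W1 = W2.T @ E @ X.T
    W2 -= lr * grad_W2
    W1 -= lr * grad_W1

# The balance matrix is (approximately) invariant along the trajectory.
balance_drift = np.linalg.norm(W2.T @ W2 - W1 @ W1.T - lam * np.eye(d))
loss = 0.5 * np.mean((W2 @ W1 @ X - Y) ** 2)
print(f"loss = {loss:.4f}, balance drift = {balance_drift:.2e}")
```

In this sketch, small $\lambda$ (and small weight scale) pushes the dynamics toward the rich regime, while large $\lambda$ keeps the early-layer representation nearly static, consistent with the lazy regime described in the abstract.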
Is Neurips Submission: No
Submission Number: 8