Exact Learning Dynamics of Bottlenecked and Wide Deep Linear Networks

Published: 23 Sept 2025, Last Modified: 17 Nov 2025 · UniReps 2025 · CC BY 4.0
Track: Extended Abstract Track
Keywords: Deep learning, Learning theory, Learning Regime, Rich, Lazy
TL;DR: The paper offers explicit solutions for gradient flow in bottlenecked and wide two-layer linear networks under various initializations.
Abstract: Architectural diversity shapes learning in both biological and artificial systems, influencing how features are represented and generalized. Motivated by parallels between brain circuitry and machine learning architectures, we study the exact learning dynamics of wide and narrow two-layer linear networks, extending the class of solvable models to encompass a broader range of architectural configurations. Our framework captures how depth, width, and bottleneck structures affect feature learning. This approach not only has the potential to advance theoretical understanding of network architectures relevant to neuroscience, such as cerebellum-like structures, but also informs practical designs in modern machine learning, including Low-Rank Adaptation (LoRA) modules.
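The setting the abstract describes, gradient-flow dynamics of a two-layer linear network with a narrow (bottlenecked) versus wide hidden layer, can be sketched numerically. The snippet below is an illustrative simulation only, not the paper's closed-form solutions; all function and variable names (`train_two_layer_linear`, dimensions, learning rate, initialization scale) are assumptions chosen for the demo.

```python
import numpy as np

def train_two_layer_linear(hidden_dim, steps=2000, lr=0.05, init_scale=0.1, seed=0):
    """Discrete-time gradient-descent dynamics of a two-layer linear net
    W2 @ W1 fitting a fixed linear target map (whitened inputs assumed)."""
    rng = np.random.default_rng(seed)
    d_in, d_out = 4, 3
    # Rank-2 target, so even a width-2 bottleneck can represent it exactly.
    A = rng.standard_normal((d_out, 2)) @ rng.standard_normal((2, d_in))
    W1 = init_scale * rng.standard_normal((hidden_dim, d_in))
    W2 = init_scale * rng.standard_normal((d_out, hidden_dim))
    losses = []
    for _ in range(steps):
        E = W2 @ W1 - A                      # error in the composed map
        losses.append(0.5 * np.sum(E ** 2))  # squared Frobenius loss
        gW2 = E @ W1.T                       # dL/dW2
        gW1 = W2.T @ E                       # dL/dW1
        W2 -= lr * gW2
        W1 -= lr * gW1
    return np.array(losses)

narrow = train_two_layer_linear(hidden_dim=2)   # bottlenecked network
wide = train_two_layer_linear(hidden_dim=64)    # wide network
print(f"final loss (narrow): {narrow[-1]:.2e}")
print(f"final loss (wide):   {wide[-1]:.2e}")
```

With small initialization the loss curves show the plateau-then-drop profile characteristic of the rich regime, and the two widths trace different trajectories toward the same rank-2 solution, which is the kind of architectural effect the paper's exact solutions characterize analytically.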
Submission Number: 24