Keywords: Deep learning, Learning theory
Abstract: Architectural diversity shapes learning in both biological and artificial systems, influencing how features are represented and generalized. Motivated by parallels between brain circuitry and machine learning architectures, we study the exact learning dynamics of wide and narrow two-layer linear networks, extending the class of solvable models to encompass a broader range of architectural configurations. Our framework captures how depth, width, and bottleneck structures affect feature learning. This approach not only has the potential to advance theoretical understanding of network architectures relevant to neuroscience, such as cerebellum-like structures, but also to inform practical designs in modern machine learning, including low-dimensional bottlenecking in Low-Rank Adaptation (LoRA) modules.
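To make the abstract's setting concrete, the following is a minimal sketch (not the paper's exact model or solution) of the kind of system being analyzed: a two-layer linear network y = W2 W1 x with a hidden layer whose width h can be made wide or narrow relative to the input/output dimensions, trained by full-batch gradient descent on a linear teacher. All dimensions, the initialization scale, and the learning rate here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: narrow hidden layer h < min(d_in, d_out)
# creates a rank bottleneck, as in the bottleneck architectures the
# abstract refers to.
d_in, h, d_out, n = 10, 3, 5, 200

# Linear teacher and Gaussian inputs (assumed setup).
T = rng.normal(size=(d_out, d_in))
X = rng.normal(size=(d_in, n))
Y = T @ X

# Small random initialization of the two weight matrices.
W1 = 0.1 * rng.normal(size=(h, d_in))
W2 = 0.1 * rng.normal(size=(d_out, h))

lr, steps = 0.05, 500
losses = []
for _ in range(steps):
    E = W2 @ W1 @ X - Y                      # residual on the batch
    losses.append(0.5 * np.mean(np.sum(E**2, axis=0)))
    gW2 = E @ (W1 @ X).T / n                 # dL/dW2
    gW1 = W2.T @ E @ X.T / n                 # dL/dW1
    W2 -= lr * gW2
    W1 -= lr * gW1

# The loss decreases but cannot reach zero: the rank-h bottleneck
# limits the network to a rank-3 approximation of the rank-5 teacher.
print(losses[0], losses[-1])
```

The exact-dynamics results the abstract describes would characterize trajectories like `losses` in closed form as a function of width and depth, rather than by simulation.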
Submission Number: 8