Keywords: Deep Learning, Theory, Emergence, Scaling Laws, Statistical Physics, Spectral Methods, High-Dimensional Statistics, Random Matrix Theory
TL;DR: We develop Neural Low-Degree Filtering, an analyzable spectral theory that explains key aspects of deep learning: emergence, layerwise composition, task-adaptive representations, and the advantage of depth.
Abstract: Understanding how deep neural networks learn useful internal representations from data remains a central open problem in the theory of deep learning. We introduce \emph{Neural Low-Degree Filtering} (Neural LoFi), a stylized limit of gradient-based training in which hierarchical feature learning becomes an explicit iterative spectral procedure. In this limit, the dynamics at each layer decouple: given the current representation, the next layer selects directions with maximal accessible low-degree correlation to the label. This yields a tractable surrogate mechanism for deep learning, together with a natural kernel-space interpretation. Neural LoFi provides a mathematically explicit framework for studying feature learning beyond the lazy regime. It predicts how representations are selected layer by layer, and gives a concrete mechanism by which depth progressively constructs new features from old ones. We complement the theory with mechanistic experiments on fully connected and convolutional architectures, showing that Neural LoFi improves over random-feature baselines, recovers meaningful structured filters, and predicts representations aligned with early gradient-descent feature discovery.
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 34