Pointwise Generalization in Deep Neural Networks

19 Sept 2025 (modified: 17 Dec 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: Deep Neural Networks, Nature of Generalization, Pointwise Riemannian Dimension, Feature Learning, Finite-Scale Geometry, Avoid NTK and Exponential Norm Barriers
TL;DR: A complete generalization theory for fully connected deep nets: bounds depend on the effective rank of the learned features at the trained model and are empirically orders of magnitude tighter.
Abstract: We address the fundamental question of why deep neural networks generalize by establishing a pointwise generalization theory for fully connected networks. For each trained model, we characterize the hypothesis via a pointwise Riemannian Dimension, derived from the eigenvalues of the \textit{learned feature representations} across layers. This approach establishes a principled framework for deriving tight, hypothesis-dependent generalization bounds that accurately characterize the rich, nonlinear regime, systematically improving on approaches based on model size, products of norms, and infinite-width linearizations, and yielding guarantees that are orders of magnitude tighter in both theory and experiment. Analytically, we identify the structural properties and mathematical principles that explain the tractability of deep networks. Empirically, the pointwise Riemannian Dimension exhibits substantial feature compression, decreases with increased over-parameterization, and captures the implicit bias of optimizers. Taken together, our results indicate that deep networks are mathematically tractable in practical regimes and that their generalization is sharply explained by pointwise, spectrum-aware complexity.
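The abstract states that the pointwise Riemannian Dimension is derived from the eigenvalues of the learned feature representations across layers, but does not spell out the formula. Below is a minimal sketch of one plausible spectrum-aware quantity of that kind, assuming an entropy-based effective rank of each layer's feature covariance summed over layers; the function names (`effective_rank`, `pointwise_dimension`) and the aggregation rule are illustrative assumptions, not the paper's definition.

```python
# Illustrative sketch (not the paper's exact definition): summarize the
# spectrum of learned features by an effective rank of the layer-wise
# feature covariance, then aggregate across layers.
import numpy as np

def effective_rank(features: np.ndarray, eps: float = 1e-12) -> float:
    """Entropy-based effective rank of a (n_samples, width) feature matrix."""
    # Eigenvalues of the (width x width) empirical feature covariance.
    cov = features.T @ features / features.shape[0]
    eigvals = np.clip(np.linalg.eigvalsh(cov), 0.0, None)
    p = eigvals / (eigvals.sum() + eps)           # normalized spectrum
    entropy = -(p * np.log(p + eps)).sum()        # Shannon entropy of spectrum
    return float(np.exp(entropy))                 # exp(entropy) = effective rank

def pointwise_dimension(layer_features: list[np.ndarray]) -> float:
    """Aggregate per-layer spectral dimensions; summation is an assumption."""
    return sum(effective_rank(f) for f in layer_features)

# Usage: pass post-activation features of each hidden layer on a data batch,
# e.g. layer_features = [h1, h2, ..., hL] with each h of shape (batch, width).
```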
Primary Area: learning theory
Submission Number: 17867