Keywords: Generalization bound, covering numbers, compositionality
TL;DR: A new generalization bound for DNNs that illustrates how DNNs can leverage compositionality to break the curse of dimensionality.
Abstract: We show that deep neural networks (DNNs) can efficiently learn any
composition of functions with bounded $F_{1}$-norm, which allows
DNNs to break the curse of dimensionality in ways that shallow networks
cannot. More specifically, we derive a generalization bound that combines
a covering-number argument for compositionality with the $F_{1}$-norm
(or the related Barron norm) for large-width adaptivity. We show that
the global minimizer of the regularized loss of DNNs can fit for example
the composition of two functions $f^{*}=h\circ g$ from a small number
of observations, assuming $g$ is smooth/regular and reduces the dimensionality
(e.g. $g$ could be the quotient map of the symmetries of $f^{*}$),
so that $h$ can be learned in spite of its low regularity. The measure
of regularity we consider is the Sobolev norm with different levels
of differentiability, which is well adapted to the $F_{1}$-norm.
We compute scaling laws empirically and observe phase transitions
depending on whether $g$ or $h$ is harder to learn, as predicted
by our theory.
Primary Area: learning theory
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5174