Generalizability of Neural Networks Minimizing Empirical Risk Based on Expressive Power

Lijia Yu; Yibo Miao; Yifan Zhu; Xiao-Shan Gao; Lijun Zhang

Generalizability of Neural Networks Minimizing Empirical Risk Based on Expressive Power

Lijia Yu, Yibo Miao, Yifan Zhu, Xiao-Shan Gao, Lijun Zhang

Published: 22 Jan 2025, Last Modified: 13 May 2025ICLR 2025 PosterEveryoneRevisionsBibTeXCC BY 4.0

Keywords: generalization bound, expressive power

TL;DR: We establish the generalization bound based on the expressive power for the network which minimize the Empirical Risk.

Abstract: The primary objective of learning methods is generalization. Classic generalization bounds, based on VC-dimension or Rademacher complexity, are uniformly applicable to all networks in the hypothesis space. On the other hand, algorithm-dependent generalization bounds, like stability bounds, address more practical scenarios and provide generalization conditions for neural networks trained using SGD. However, these bounds often rely on strict assumptions, such as the NTK hypothesis or convexity of the empirical loss, which are typically not met by neural networks. In order to establish generalizability under less stringent assumptions, this paper investigates generalizability of neural networks that minimize the empirical risk. A lower bound for population accuracy is established based on the expressiveness of these networks, which indicates that with adequately large training sample and network sizes, these networks can generalize effectively. Additionally, we provide a lower bound necessary for generalization, demonstrating that, for certain data distributions, the quantity of data required to ensure generalization exceeds the network size needed to represent that distribution. Finally, we provide theoretical insights into several phenomena in deep learning, including robust overfitting, importance of over-parameterization networks, and effects of loss functions.

Primary Area: learning theory

Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.

Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.

Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.

No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.

Submission Number: 2381

Loading