Comparative Generalization Bounds for Deep Neural Networks

Tomer Galanti; Liane Galanti; Ido Ben-Shaul

Published: 26 May 2023, Last Modified: 17 Sept 2024. Accepted by TMLR.

Abstract: In this work, we investigate the generalization capabilities of deep neural networks. We introduce a novel measure of the effective depth of neural networks, defined as the first layer at which sample embeddings are separable using the nearest-class center classifier. Our empirical results demonstrate that, in standard classification settings, neural networks trained using Stochastic Gradient Descent (SGD) tend to have small effective depths. We also explore the relationship between effective depth, the complexity of the training dataset, and generalization. For instance, we find that the effective depth of a trained neural network increases as the proportion of random labels in the data rises. Finally, we derive a generalization bound by comparing the effective depth of a network with the minimal depth required to fit the same dataset with partially corrupted labels. This bound provides non-vacuous predictions of test performance and is found to be empirically independent of the actual depth of the network.
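The effective-depth definition in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes that "separable" means the nearest-class-center (NCC) classifier reaches a training error at or below a tolerance `tol`, and that per-layer embeddings are already available as arrays. The names `ncc_error`, `effective_depth`, `layer_features`, and `tol` are hypothetical.

```python
import numpy as np


def ncc_error(features, labels):
    """Fraction of samples whose nearest class center belongs to another class.

    features: array of shape (n_samples, dim), embeddings at one layer.
    labels:   array of shape (n_samples,), integer class labels.
    """
    classes = np.unique(labels)
    # One center per class: the mean embedding of that class's samples.
    centers = np.stack([features[labels == c].mean(axis=0) for c in classes])
    # Squared Euclidean distances, shape (n_samples, n_classes).
    dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    predicted = classes[dists.argmin(axis=1)]
    return float((predicted != labels).mean())


def effective_depth(layer_features, labels, tol=0.0):
    """1-based index of the first layer whose NCC error is <= tol.

    Assumption: `tol` stands in for the paper's separability criterion.
    Returns None if no layer meets the threshold.
    """
    for depth, feats in enumerate(layer_features, start=1):
        if ncc_error(feats, labels) <= tol:
            return depth
    return None


# Toy usage: two "layers" of 1-D embeddings for four samples.
labels = np.array([0, 0, 1, 1])
layer1 = np.array([[0.0], [1.0], [0.0], [1.0]])  # class centers coincide
layer2 = np.array([[0.0], [0.1], [1.0], [1.1]])  # classes are well separated
print(effective_depth([layer1, layer2], labels))
```

In this toy case the first layer is not NCC-separable because both class centers coincide, while the second is, so the sketch reports the second layer. In practice the per-layer embeddings would come from forward passes of a trained network over the training set.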

Submission Length: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission: Following the reviews, we made the following changes:
1. We conducted experiments to validate the assumptions made in the analysis.
2. We extended the discussion of related work.
3. We added a table comparing our results with additional papers suggested by reviewer rPZ8.
4. We incorporated several clarifications and suggestions made by the reviewers.

In addition to the changes made in the previous version, the camera-ready version incorporates the following revisions:
1. To strengthen our theoretical analysis, we included Propositions 1, 2, and 5, which theoretically support the assumption made in Equation 3.
2. To improve clarity, we made some corrections to the text.
3. We reviewed the bibliography to ensure that all papers are cited accurately and consistently.

Supplementary Material: zip

Assigned Action Editor: ~Yunhe_Wang1

License: Creative Commons Attribution 4.0 International (CC BY 4.0)

Submission Number: 881
