The Scaling Laws of Classification Networks: Insights from Adaptive Exact Average Density Approximation

11 Mar 2026 (modified: 04 May 2026) · Decision pending for TMLR · CC BY 4.0
Abstract: Our main goal is to establish a generalization bound for classification tasks that aligns with the empirical scaling laws observed in deep neural networks (DNNs). Under the assumption that the boundary of the target classification function is a semi-algebraic set, we show that the generalization error bound can follow scaling laws for large networks. The rate of scaling with respect to sample size is intrinsically linked to the effective dimension of the data manifold, independent of the specific network model or learning algorithm applied. In contrast, the scaling law with respect to the number of parameters varies across learning methods and network architectures. This variability in the parameter scaling law can be quantified by the notion of ``box-module dimension,'' which measures how the number of model parameters grows as the radius of covering balls decreases, capturing the complexity of the target classification boundary. Using this scaling law, we empirically demonstrate the feasibility of predicting the generalization errors of larger models from those of smaller models.
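As an illustration of the final claim, the sketch below shows one common way such an extrapolation can be done: fit a power-law form E(N) ≈ a·N^(−α) + c to the test errors of small models and evaluate the fit at a larger parameter count. This is a minimal, hedged example, not the paper's procedure; the functional form, the parameter counts, and the error values are all hypothetical placeholders.

```python
import numpy as np
from scipy.optimize import curve_fit

# Hypothetical measurements: (parameter count, test error) for a family of
# small models. These numbers are illustrative, not from the paper.
params = np.array([1e5, 3e5, 1e6, 3e6, 1e7])
errors = np.array([0.310, 0.262, 0.221, 0.188, 0.160])

def power_law(N, a, alpha, c):
    # Assumed scaling form E(N) = a * N^(-alpha) + c, where c models an
    # irreducible error floor. This is a generic ansatz, not the paper's bound.
    return a * N ** (-alpha) + c

# Fit the three free parameters; p0 gives a reasonable starting point.
popt, _ = curve_fit(power_law, params, errors, p0=[1.0, 0.2, 0.05], maxfev=10000)
a, alpha, c = popt
print(f"fitted exponent alpha = {alpha:.3f}, error floor c = {c:.3f}")

# Extrapolate the fitted law to a larger model, e.g. 1e8 parameters.
print(f"predicted error at 1e8 params: {power_law(1e8, *popt):.3f}")
```

Under the paper's framework, the exponent with respect to parameter count would be governed by the box-module dimension of the classification boundary, while the sample-size exponent would be set by the effective dimension of the data manifold; the fit above simply treats the exponent as a free parameter.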
Submission Type: Regular submission (no more than 12 pages of main content)
Previous TMLR Submission Url: https://openreview.net/forum?id=3XcAqg4CWZ
Changes Since Last Submission: Previous rejection reason: a highly incomplete OpenReview author profile, preventing accurate conflict-of-interest determination. Action: I have updated all the required fields in the profile.
Assigned Action Editor: ~Wuyang_Chen1
Submission Number: 7883