Keywords: concentration inequalities, isoperimetry, robustness, stability, classification problems, generalization, overparameterization
TL;DR: We show that interpolating classifiers can only be stable, and thus generalize well, if they are sufficiently overparameterized.
Abstract: In this work, we show that class stability, the expected distance of an input to the decision boundary, captures what classical capacity measures, such as weight norms, fail to explain. In particular, we prove a generalization bound that improves inversely with the class stability. As a corollary, interpreting class stability as a quantifiable notion of robustness, we derive a law of robustness for classification that extends the results of Bubeck and Sellke beyond smoothness assumptions to discontinuous functions. Specifically, any model that interpolates $n$ data points with only $p \approx n$ parameters must be unstable, so high stability requires substantial overparameterization. Preliminary experiments support this theory: empirical stability increases with model size, while traditional norm-based measures remain uninformative.
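For concreteness, one plausible formalization of the class stability described above, with $\mathcal{S}$, $\mu$, and $\partial f$ as assumed notation rather than the paper's own, is
$$\mathcal{S}(f) = \mathbb{E}_{x \sim \mu}\!\left[\operatorname{dist}\big(x,\, \partial f\big)\right],$$
where $f$ is the classifier, $\mu$ the input distribution, and $\partial f$ the decision boundary of $f$; under this reading, the stated generalization bound scales on the order of $1/\mathcal{S}(f)$, improving as the expected distance to the boundary grows.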
Student Paper: Yes
Submission Number: 100