- Abstract: We show that if the usual training loss is augmented by a Lipschitz regularization term, then the networks generalize. We prove generalization by first establishing a stronger convergence result, along with a rate of convergence. A second result resolves a question posed in Zhang et al. (2016): how can a model distinguish between the case of clean labels, and randomized labels? Our answer is that Lipschitz regularization using the Lipschitz constant of the clean data makes this distinction. In this case, the model learns a different function which we hypothesize correctly fails to learn the dirty labels.
- Keywords: Deep Neural Networks, Regularization, Generalization, Convergence, Lipschitz, Stability
- TL;DR: We prove generalization of DNNs by adding a Lipschitz regularization term to the training loss. We resolve a question posed in Zhang et al. (2016).