A Mechanism of Implicit Regularization in Deep LearningDownload PDF

25 Sep 2019 (modified: 24 Dec 2019)ICLR 2020 Conference Blind SubmissionReaders: Everyone
  • Original Pdf: pdf
  • Keywords: Implicit Regularization, Generalization, Deep Neural Network, Low Complexity
  • Abstract: Despite a lot of theoretical efforts, very little is known about mechanisms of implicit regularization by which the low complexity contributes to generalization in deep learning. In particular, causality between the generalization performance, implicit regularization and nonlinearity of activation functions is one of the basic mysteries of deep neural networks (DNNs). In this work, we introduce a novel technique for DNNs called random walk analysis and reveal a mechanism of the implicit regularization caused by nonlinearity of ReLU activation. Surprisingly, our theoretical results suggest that the learned DNNs interpolate almost linearly between data points, which leads to the low complexity solutions in the over-parameterized regime. As a result, we prove that stochastic gradient descent can learn a class of continuously differentiable functions with generalization bounds of the order of $O(n^{-2})$ ($n$: the number of samples). Furthermore, our analysis is independent of the kernel methods, including neural tangent kernels.
14 Replies