Locality defeats the curse of dimensionality in convolutional teacher-student scenariosDownload PDF

May 21, 2021 (edited Oct 26, 2021)NeurIPS 2021 PosterReaders: Everyone
  • Keywords: Deep learning, Convolutional Neural Networks, Kernel Methods, Learning Curves
  • TL;DR: We compute the learning curves of convolutional kernels in a teacher-student framework. We find that locality is key to beat the curse of dimensionality, while translational invariance is not.
  • Abstract: Convolutional neural networks perform a local and translationally-invariant treatment of the data: quantifying which of these two aspects is central to their success remains a challenge. We study this problem within a teacher-student framework for kernel regression, using 'convolutional' kernels inspired by the neural tangent kernel of simple convolutional architectures of given filter size. Using heuristic methods from physics, we find in the ridgeless case that locality is key in determining the learning curve exponent $\beta$ (that relates the test error $\epsilon_t\sim P^{-\beta}$ to the size of the training set $P$), whereas translational invariance is not. In particular, if the filter size of the teacher $t$ is smaller than that of the student $s$, $\beta$ is a function of $s$ only and does not depend on the input dimension. We confirm our predictions on $\beta$ empirically. We conclude by proving, under a natural universality assumption, that performing kernel regression with a ridge that decreases with the size of the training set leads to similar learning curve exponents to those we obtain in the ridgeless case.
  • Supplementary Material: pdf
  • Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.
  • Code: https://github.com/fran-cagnetta/local_kernels
9 Replies

Loading