TL;DR: Accurate analytical predictions for DNN learning curves using a novel field-theory approach
Abstract: A series of recent works established a rigorous correspondence between very wide deep neural networks (DNNs), trained in a particular manner, and noiseless Bayesian inference with a certain Gaussian Process (GP) known as the Neural Tangent Kernel (NTK). Here we extend a known field-theory formalism for GP inference to obtain a detailed understanding of learning curves in DNNs trained in the regime of this correspondence (the NTK regime). In particular, a renormalization-group approach is used to show that noiseless GP inference with the NTK, which lacks a good analytical handle, can be well approximated by noisy GP inference on a related kernel we call the renormalized NTK. Following this, a perturbation-theory analysis is carried out in one over the dataset size, yielding analytical expressions for the (fixed-teacher/fixed-target) leading and sub-leading asymptotics of the learning curves. At least for uniform datasets, a coherent picture emerges wherein fully-connected DNNs have a strong implicit bias towards functions which are low-order polynomials of the input.
Keywords: Gaussian Processes, Neural Tangent Kernel, Learning Curves, Field Theory, Statistical Mechanics, Generalization, Deep Neural Networks
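To make the abstract's central approximation concrete, below is a minimal sketch of noisy GP regression, whose noiseless limit (noise variance going to zero) is the Bayesian inference the paper studies; the noise term here plays the role that the renormalization-group analysis assigns to the renormalized NTK. The RBF kernel, the noise value, and the quadratic toy target are illustrative stand-ins, not the paper's renormalized NTK or its datasets.

```python
import numpy as np

def rbf_kernel(X1, X2, lengthscale=1.0):
    # Stand-in kernel; the paper would use the (renormalized) NTK here instead.
    d2 = np.sum(X1**2, 1)[:, None] + np.sum(X2**2, 1)[None, :] - 2.0 * X1 @ X2.T
    return np.exp(-0.5 * d2 / lengthscale**2)

def gp_posterior_mean(X_train, y_train, X_test, kernel, noise_var=0.1):
    # Noisy GP regression posterior mean: K_*n (K_nn + noise_var * I)^{-1} y.
    # Taking noise_var -> 0 recovers noiseless Bayesian GP inference.
    K_nn = kernel(X_train, X_train)
    K_sn = kernel(X_test, X_train)
    alpha = np.linalg.solve(K_nn + noise_var * np.eye(len(X_train)), y_train)
    return K_sn @ alpha

# Toy usage: one point on an empirical learning curve for a low-order polynomial target.
rng = np.random.default_rng(0)
n = 64
X_train = rng.uniform(-1.0, 1.0, size=(n, 1))
y_train = X_train[:, 0] ** 2                     # quadratic teacher/target
X_test = np.linspace(-1.0, 1.0, 200)[:, None]
y_pred = gp_posterior_mean(X_train, y_train, X_test, rbf_kernel, noise_var=0.05)
mse = np.mean((y_pred - X_test[:, 0] ** 2) ** 2)
print(f"test MSE at n={n}: {mse:.4f}")
```

Repeating the toy computation over several values of n traces an empirical learning curve, which is the object whose leading and sub-leading asymptotics the paper derives analytically.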