Keywords: Gaussian process regression, kernel ridge regression, generalization error, power law, neural tangent kernel
Abstract: We characterize the power-law asymptotics of learning curves for Gaussian process regression (GPR) under the assumption that the eigenspectrum of the prior and the eigenexpansion coefficients of the target function each follow a power law. Under similar assumptions, we leverage the equivalence between GPR and kernel ridge regression (KRR) to characterize the generalization error of KRR. Infinitely wide neural networks can be related to GPR with respect to the neural network GP kernel and the neural tangent kernel, which in several cases is known to have a power-law spectrum. Hence our methods can be applied to study the generalization error of infinitely wide neural networks. We present toy experiments demonstrating the theory.
One-sentence Summary: We derive the power-law decay rate of the generalization error in Gaussian process regression as a function of the eigenspectrum of the prior and the eigenexpansion coefficients of the target function.
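
Illustrative sketch: the setting described in the abstract can be simulated in a few lines. The following is a minimal sketch under stated assumptions, not the experimental setup of the paper. The cosine eigenbasis on [0, 1], the truncation level `num_modes`, the exponents `alpha` and `beta`, and the noise level are all hypothetical choices made for illustration. The kernel eigenvalues decay as p^(-alpha), the target coefficients decay as p^(-beta), and the generalization error of the GPR posterior mean (equivalently, the KRR predictor with ridge equal to the noise variance) is computed exactly in the truncated basis.

```python
import numpy as np


def gpr_generalization_error(n, alpha=2.0, beta=1.5, noise_std=0.1,
                             num_modes=300, seed=0):
    """Squared L2 error of the GPR posterior mean for one random training set.

    Assumed (hypothetical) setup: inputs uniform on [0, 1], eigenfunctions
    phi_p(x) = sqrt(2) * cos(pi * p * x), prior eigenvalues p**(-alpha),
    target coefficients p**(-beta), truncated at num_modes modes.
    """
    rng = np.random.default_rng(seed)
    p = np.arange(1, num_modes + 1, dtype=float)
    eigvals = p ** (-alpha)   # power-law spectrum of the prior kernel
    coeffs = p ** (-beta)     # power-law eigenexpansion of the target

    # Training inputs and feature matrix Phi[i, p] = phi_p(x_i).
    x = rng.uniform(0.0, 1.0, size=n)
    phi = np.sqrt(2.0) * np.cos(np.pi * np.outer(x, p))

    # Noisy observations of the target function.
    y = phi @ coeffs + noise_std * rng.standard_normal(n)

    # Kernel matrix K = Phi diag(eigvals) Phi^T and posterior-mean weights
    # w = diag(eigvals) Phi^T (K + noise_std^2 I)^{-1} y.
    kernel = (phi * eigvals) @ phi.T
    dual = np.linalg.solve(kernel + noise_std ** 2 * np.eye(n), y)
    weights = eigvals * (phi.T @ dual)

    # The eigenfunctions are orthonormal under the uniform input density,
    # so the L2 error equals the squared coefficient error.
    return float(np.sum((coeffs - weights) ** 2))


def learning_curve(sample_sizes, num_trials=5, **kwargs):
    """Average the error over several random training sets per sample size."""
    return [
        float(np.mean([gpr_generalization_error(n, seed=s, **kwargs)
                       for s in range(num_trials)]))
        for n in sample_sizes
    ]


if __name__ == "__main__":
    sizes = [16, 32, 64, 128, 256]
    for n, err in zip(sizes, learning_curve(sizes)):
        print(n, err)
```

Plotting the printed errors against n on log-log axes gives an empirical learning curve whose slope can be compared against a predicted power-law decay rate. The truncation and the small number of trials make this a rough check only.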