Keywords: Gaussian processes, neural scaling laws, wide networks, infinite neural networks
TL;DR: We improve the Gaussian approximation of wide neural networks by computing corrections of order 1/n, where n is the network width.
Abstract: Gaussian approximations are often used to develop the theory of how neural networks scale as the number of neurons grows large. However, these approximations are known to break down as depth increases, due to the accumulation of approximation errors. To remedy this, we introduce a new family of distributions that arises naturally in neural networks and yields more accurate approximations than the usual Gaussian approximation. We develop a method for obtaining the probability density functions of these distributions via Hermite polynomials and connect this to the classical Edgeworth expansion.
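As a rough illustration of the classical Edgeworth expansion mentioned in the abstract (not the paper's own construction), the sketch below corrects a standard Gaussian density with Hermite-polynomial terms built from higher cumulants; the cumulant values `kappa3`, `kappa4` and the width parameter `n` are hypothetical placeholders.

```python
# Minimal sketch of the classical Edgeworth expansion: a Gaussian density
# corrected by probabilists' Hermite polynomials He_k, with coefficients
# given by higher cumulants and suppressed by powers of 1/sqrt(n).
import numpy as np
from scipy.special import eval_hermitenorm  # probabilists' Hermite He_k
from scipy.stats import norm

def edgeworth_density(x, n, kappa3=0.5, kappa4=0.3):
    """Edgeworth-corrected density, keeping terms through order 1/n:

    phi(x) * [1 + kappa3/(6 sqrt(n)) He3(x)
                + kappa4/(24 n) He4(x) + kappa3^2/(72 n) He6(x)]
    """
    phi = norm.pdf(x)
    corr = (1.0
            + kappa3 / (6 * np.sqrt(n)) * eval_hermitenorm(3, x)
            + kappa4 / (24 * n) * eval_hermitenorm(4, x)
            + kappa3**2 / (72 * n) * eval_hermitenorm(6, x))
    return phi * corr

# Example: evaluate the corrected density at a few points for width n = 64.
x = np.linspace(-4.0, 4.0, 9)
print(edgeworth_density(x, n=64))
```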
Is NeurIPS Submission: No
Submission Number: 29