Quantitative convergence of trained neural networks to Gaussian processes

Published: 18 Sept 2025 · Last Modified: 29 Oct 2025 · NeurIPS 2025 poster · CC BY 4.0
Keywords: Neural networks, NTK, Gaussian process, Wasserstein distance, Gaussian approximation, shallow neural networks, wide limit, infinite-width neural network
TL;DR: We prove quantitative convergence estimates for single-layer neural networks in the NTK regime to Gaussian processes at positive training time.
Abstract: In this paper, we study the quantitative convergence of shallow neural networks trained via gradient descent to their associated Gaussian processes in the infinite-width limit. While previous work has established qualitative convergence under broad settings, precise finite-width estimates remain limited, particularly during training. We provide explicit upper bounds on the quadratic Wasserstein distance between the network output and its Gaussian approximation at any training time $t \ge 0$, demonstrating polynomial decay with network width. Our results quantify how architectural parameters, such as width and input dimension, influence convergence, and how training dynamics affect the approximation error.
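A minimal numerical sketch of the quantity studied in the abstract, not taken from the paper or its supplementary material: it trains many independent shallow networks in NTK parameterization $f(x) = \tfrac{1}{\sqrt{m}} a^\top \tanh(Wx)$ by a few steps of full-batch gradient descent, then estimates the 1D 2-Wasserstein distance between the empirical distribution of the trained output at a fixed test point and a moment-matched Gaussian. The activation, learning rate, step count, dataset, and quantile-based W2 estimator are all assumptions made for illustration; one would expect the printed distance to shrink as the width m grows, in line with the polynomial-in-width decay the paper proves.

```python
# Illustrative sketch (assumptions, not the paper's experiments): empirical
# 2-Wasserstein distance between a trained shallow NTK-parameterized network's
# output at a test point and a fitted Gaussian, for several widths m.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
d = 5                                            # input dimension (assumed)
X = rng.standard_normal((8, d)) / np.sqrt(d)     # tiny synthetic training set
y = rng.standard_normal(8)
x_test = rng.standard_normal(d) / np.sqrt(d)
lr, steps, n_nets = 0.5, 50, 2000                # assumed training setup

def train_and_eval(m):
    """Train one width-m network f(x) = a.tanh(Wx)/sqrt(m) by full-batch GD
    on squared loss; return its output at x_test."""
    W = rng.standard_normal((m, d))
    a = rng.standard_normal(m)
    for _ in range(steps):
        h = np.tanh(X @ W.T)                     # (n, m) hidden activations
        f = h @ a / np.sqrt(m)                   # outputs on training set
        r = f - y                                # residuals
        grad_a = h.T @ r / np.sqrt(m)
        grad_W = ((np.outer(r, a) * (1 - h**2)).T @ X) / np.sqrt(m)
        a -= lr * grad_a / len(y)
        W -= lr * grad_W / len(y)
    return np.tanh(x_test @ W.T) @ a / np.sqrt(m)

def w2_to_gaussian(samples):
    """Quantile-based estimate of W2 between samples and a moment-matched Gaussian."""
    s = np.sort(samples)
    q = norm.ppf((np.arange(len(s)) + 0.5) / len(s), loc=s.mean(), scale=s.std())
    return np.sqrt(np.mean((s - q) ** 2))

for m in [16, 64, 256, 1024]:
    outs = np.array([train_and_eval(m) for _ in range(n_nets)])
    print(f"width m={m:5d}:  empirical W2 to Gaussian ~ {w2_to_gaussian(outs):.4f}")
```

This only checks the one-dimensional marginal at a single test input; the paper's bounds concern the full network output as a process, so the sketch is a sanity check rather than a verification of the stated rates.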
Supplementary Material: zip
Primary Area: Probabilistic methods (e.g., variational inference, causal inference, Gaussian processes)
Submission Number: 13074