Finite-Sample Wasserstein Error Bounds and Concentration Inequalities for Nonlinear Stochastic Approximation
Abstract: This paper derives non-asymptotic error bounds for nonlinear stochastic approximation
algorithms in the Wasserstein-p distance. To obtain explicit finite-sample guarantees for the
last iterate, we develop a coupling argument that compares the discrete-time process to a
limiting Ornstein-Uhlenbeck process. Our analysis applies to algorithms driven by general
noise sequences, including martingale differences and functions of ergodic Markov chains.
Complementing this result, we establish the convergence rate of the Polyak-Ruppert average
through a direct analysis that applies in the same general setting.
Assuming the driving noise satisfies a non-asymptotic central limit theorem, we show that
the normalized last iterates converge to a Gaussian distribution in the Wasserstein-$p$ distance at a
rate of order $\gamma_n^{1/6}$, where $\gamma_n$ is the step size. Similarly, the normalized Polyak-Ruppert average converges in the Wasserstein distance at a rate of order $n^{-1/6}$. These distributional guarantees
imply high-probability concentration inequalities that improve upon those derived from moment
bounds and Markov’s inequality. We demonstrate the utility of this approach by considering two
applications: (1) linear stochastic approximation, where we explicitly quantify the transition
from heavy-tailed to Gaussian behavior of the iterates, thereby bridging the gap between recent
finite-sample analyses and asymptotic theory, and (2) stochastic gradient descent, where we
establish a rate of convergence in the corresponding central limit theorem.
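For concreteness, the display below sketches the setting the stated rates refer to. The recursion, the normalizations, and the limiting covariances shown are the standard ones in this literature and are assumptions on our part, since the abstract does not spell them out.

% Minimal sketch of the presumed setting; the update rule, scaling factors, and
% covariance matrices below are assumed, not taken verbatim from the abstract.
\begin{align*}
  \theta_{n+1} &= \theta_n + \gamma_{n+1}\bigl(F(\theta_n) + \xi_{n+1}\bigr)
    && \text{nonlinear SA update with step size } \gamma_n,\\
  \bar{\theta}_n &= \frac{1}{n}\sum_{k=1}^{n} \theta_k
    && \text{Polyak-Ruppert average,}\\
  W_p\Bigl(\mathrm{Law}\bigl(\gamma_n^{-1/2}(\theta_n - \theta^\star)\bigr),\, \mathcal{N}(0,\Sigma)\Bigr)
    &\lesssim \gamma_n^{1/6}
    && \text{last-iterate bound of the stated order,}\\
  W_p\Bigl(\mathrm{Law}\bigl(\sqrt{n}\,(\bar{\theta}_n - \theta^\star)\bigr),\, \mathcal{N}(0,\Sigma_\infty)\Bigr)
    &\lesssim n^{-1/6}
    && \text{averaged-iterate bound of the stated order.}
\end{align*}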