Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning

Published: 25 Sept 2024, Last Modified: 16 Jan 2025 · NeurIPS 2024 poster · CC BY 4.0
Keywords: Linear Stochastic Approximation, Normal Approximation, Bootstrap Validity
TL;DR: We prove rates of normal approximation for the Polyak-Ruppert averaged iterates of the linear stochastic approximation algorithm and non-asymptotic guarantees for confidence intervals constructed via the multiplier bootstrap
Abstract: In this paper, we obtain a Berry–Esseen bound for multivariate normal approximation of the Polyak-Ruppert averaged iterates of the linear stochastic approximation (LSA) algorithm with decreasing step size. Moreover, we prove the non-asymptotic validity of confidence intervals for LSA-based parameter estimation constructed via the multiplier bootstrap. This procedure updates the LSA estimate together with a set of randomly perturbed LSA estimates upon the arrival of each subsequent observation. We illustrate our findings in the setting of temporal difference learning with linear function approximation.
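
To make the procedure concrete, the following is a minimal Python sketch of Polyak-Ruppert averaged LSA with an online multiplier bootstrap. It assumes a decreasing step size alpha_k = c0 * k^(-gamma), a stream of observations (A_k, b_k) with E[A_k] = A and E[b_k] = b, and i.i.d. non-negative multipliers with mean 1 and variance 1; the function name, multiplier law, and schedule are illustrative assumptions, not the paper's exact specification.

import numpy as np

def lsa_with_multiplier_bootstrap(stream, d, n_boot=50, c0=1.0, gamma=0.75, seed=None):
    """Sketch (illustrative, not the paper's exact algorithm).

    stream yields pairs (A_k, b_k) of a random (d, d) matrix and (d,) vector.
    Returns the Polyak-Ruppert averaged iterate and n_boot perturbed averages.
    """
    rng = np.random.default_rng(seed)
    theta = np.zeros(d)                        # LSA iterate
    theta_b = np.zeros((n_boot, d))            # randomly perturbed LSA iterates
    avg = np.zeros(d)                          # running Polyak-Ruppert average
    avg_b = np.zeros((n_boot, d))              # running averages of perturbed iterates
    for k, (A_k, b_k) in enumerate(stream, start=1):
        alpha = c0 * k ** (-gamma)             # decreasing step size
        theta = theta - alpha * (A_k @ theta - b_k)
        # i.i.d. multipliers with mean 1 and variance 1 (assumed Exp(1) here)
        w = rng.exponential(1.0, size=n_boot)
        theta_b = theta_b - alpha * w[:, None] * (theta_b @ A_k.T - b_k)
        # update running averages upon arrival of the new observation
        avg += (theta - avg) / k
        avg_b += (theta_b - avg_b) / k
    return avg, avg_b

Empirical quantiles of the deviations of the perturbed averages avg_b around avg can then be used to form bootstrap confidence sets for the target parameter; in the temporal difference learning example, A_k and b_k are built from consecutive state features and rewards.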
Supplementary Material: zip
Primary Area: Probabilistic methods (for example: variational inference, Gaussian processes)
Submission Number: 9170