Central Limit Theorems for Asynchronous Averaged Q-Learning

Published: 22 Sept 2025, Last Modified: 01 Dec 2025, NeurIPS 2025 Workshop, CC BY 4.0
Keywords: Q-Learning, Central Limit Theorem, Stochastic Approximation, Reinforcement Learning
Abstract: This paper establishes central limit theorems for Polyak–Ruppert averaged Q-learning under asynchronous updates. We present a non-asymptotic central limit theorem, where the convergence rate in Wasserstein distance explicitly reflects the dependence on the number of iterations, state–action space size, the discount factor, and the quality of exploration. In addition, we derive a functional central limit theorem, showing that the partial-sum process converges weakly to a Brownian motion.
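The object of study, asynchronous Q-learning with Polyak–Ruppert averaging, can be sketched as follows. This is a minimal illustrative implementation, not the paper's exact setting: the toy MDP, the polynomial step size, and the uniform exploration policy are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy MDP (illustrative): random transition kernel P[s, a] -> next-state
# distribution, deterministic rewards R[s, a], discount factor gamma.
n_states, n_actions, gamma = 4, 2, 0.9
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.uniform(size=(n_states, n_actions))

Q = np.zeros((n_states, n_actions))   # last iterate
Q_bar = np.zeros_like(Q)              # Polyak-Ruppert average of the iterates
counts = np.zeros((n_states, n_actions))
s = 0
T = 50_000
for t in range(1, T + 1):
    a = rng.integers(n_actions)                   # uniform exploration
    s_next = rng.choice(n_states, p=P[s, a])
    counts[s, a] += 1
    alpha = 1.0 / counts[s, a] ** 0.7             # polynomial step size in (1/2, 1)
    # Asynchronous update: only the visited (s, a) entry changes.
    Q[s, a] += alpha * (R[s, a] + gamma * Q[s_next].max() - Q[s, a])
    Q_bar += (Q - Q_bar) / t                      # running average
    s = s_next

# Reference solution Q* via value iteration, to gauge the averaged iterate.
Q_star = np.zeros_like(Q)
for _ in range(1000):
    Q_star = R + gamma * P @ Q_star.max(axis=1)
print(np.abs(Q_bar - Q_star).max())
```

The averaged iterate `Q_bar` is the quantity whose fluctuations around the optimal `Q_star` the paper's central limit theorems characterize; the asynchronous aspect is that each step updates only the single state–action entry visited by the behavior policy.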
Submission Number: 109