Concentration of Contractive Stochastic Approximation and Reinforcement Learning

Siddharth Chandak, Vivek S. Borkar

Published: 2021, Last Modified: 10 May 2023, CoRR 2021, Readers: Everyone

Abstract: Using a martingale concentration inequality, concentration bounds "from time $n_0$ on" are derived for stochastic approximation algorithms with contractive maps and both martingale difference and Markov noise. These bounds are applied to reinforcement learning algorithms, in particular asynchronous Q-learning and TD(0).
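
To make the setting concrete, here is a minimal sketch of a scalar stochastic approximation iteration driven by a contractive map and martingale difference noise. It is an illustration under assumed choices, not the paper's construction: the map `F(x) = 0.5 * x + 1`, the step size `1 / (n + 1)`, the Gaussian noise, and the names `contractive_sa`, `tail_sup`, and `n0` are all hypothetical. The sketch records the largest deviation from the fixed point over all iterates from step `n0` onward, which is the kind of quantity a "from time $n_0$ on" bound would control.

```python
import random


def contractive_sa(F, x0, num_steps, n0, x_star, noise_std=1.0, seed=0):
    """Run x_{n+1} = x_n + a(n) * (F(x_n) - x_n + M_{n+1}).

    Returns the final iterate and the largest |x_n - x_star| seen
    over all iterates with index n >= n0 (the "tail" deviation).
    All modelling choices here are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = x0
    tail_sup = 0.0
    for n in range(num_steps):
        step = 1.0 / (n + 1)                # assumed decreasing step size a(n)
        noise = rng.gauss(0.0, noise_std)   # i.i.d. zero-mean noise, a martingale difference
        x = x + step * (F(x) - x + noise)
        if n + 1 >= n0:                     # only track iterates from time n0 on
            tail_sup = max(tail_sup, abs(x - x_star))
    return x, tail_sup


def F(x):
    # Assumed contraction with modulus 0.5; its unique fixed point is x* = 2.
    return 0.5 * x + 1.0


x_final, tail_error = contractive_sa(
    F, x0=10.0, num_steps=20000, n0=1000, x_star=2.0
)
print(f"final iterate: {x_final:.4f}, sup deviation from n0 on: {tail_error:.4f}")
```

Rerunning with several seeds and larger `n0` gives an empirical feel for how the tail deviation shrinks as the starting time grows; the printed values depend on the seed and are not results from the paper.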
