Tightening the Dependence on Horizon in the Sample Complexity of Q-Learning

2021 (modified: 16 May 2022), ICML 2021, Readers: Everyone
Abstract: Q-learning, which seeks to learn the optimal Q-function of a Markov decision process (MDP) in a model-free fashion, lies at the heart of reinforcement learning. Focusing on the synchronous setting ...