2021 (modified: 16 May 2022) | ICML 2021 | Readers: Everyone
Abstract: Q-learning, which seeks to learn the optimal Q-function of a Markov decision process (MDP) in a model-free fashion, lies at the heart of reinforcement learning. Focusing on the synchronous setting ...
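For readers unfamiliar with the synchronous setting named in the abstract, the following is a minimal sketch of tabular synchronous Q-learning. It is not taken from the paper: it assumes access to a generative model that can sample a next state for every state-action pair at each iteration, and it uses a constant step size for simplicity. The names `synchronous_q_learning`, `P`, `r`, `gamma`, and `step_size` are illustrative choices.

```python
import numpy as np

def synchronous_q_learning(P, r, gamma, num_iters, step_size, seed=0):
    """Tabular synchronous Q-learning (illustrative sketch, not the paper's code).

    P: array of shape (S, A, S); P[s, a] is the next-state distribution.
       It is used only as a sampler (generative model), never read directly
       by the update rule, which keeps the method model-free.
    r: array of shape (S, A) with deterministic rewards (an assumption here).
    gamma: discount factor in [0, 1).
    step_size: constant learning rate in (0, 1] (an assumption; other
       schedules are possible).
    """
    rng = np.random.default_rng(seed)
    S, A = r.shape
    Q = np.zeros((S, A))
    for _ in range(num_iters):
        # Greedy value estimate from the previous iterate.
        V = Q.max(axis=1)
        # Synchronous step: draw one next state for EVERY (s, a) pair.
        next_states = np.array(
            [[rng.choice(S, p=P[s, a]) for a in range(A)] for s in range(S)]
        )
        # Empirical Bellman target built from the sampled transitions.
        target = r + gamma * V[next_states]
        # Update all entries of Q at once.
        Q = (1.0 - step_size) * Q + step_size * target
    return Q
```

"Synchronous" here means that every entry of the Q-table is updated in each iteration using a fresh sample, in contrast to the asynchronous setting where only the entries visited along a single trajectory are updated.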