UCB Momentum Q-learning: Correcting the bias without forgetting

Pierre Ménard, Omar Darwiche Domingues, Xuedong Shang, Michal Valko

Published: 2021, Last Modified: 12 May 2023 | ICML 2021 | Readers: Everyone

Abstract: We propose UCBMQ, Upper Confidence Bound Momentum Q-learning, a new algorithm for reinforcement learning in tabular, possibly stage-dependent, episodic Markov decision processes. UCBMQ is based on...
