UCB Momentum Q-learning: Correcting the bias without forgetting

Published: 01 Jan 2021, Last Modified: 12 May 2023
ICML 2021
Readers: Everyone
Abstract: We propose UCBMQ (Upper Confidence Bound Momentum Q-learning), a new algorithm for reinforcement learning in tabular, possibly stage-dependent, episodic Markov decision processes. UCBMQ is based on...