Published: 01 Jan 2021, Last Modified: 12 May 2023, ICML 2021, Readers: Everyone
Abstract: We propose UCBMQ (Upper Confidence Bound Momentum Q-learning), a new algorithm for reinforcement learning in tabular, possibly stage-dependent, episodic Markov decision processes. UCBMQ is based on...