Keywords: distributional reinforcement learning, double Q-learning, adaptive learning, DQN, Actor-Critic
Abstract: Bias in the estimation of maxima of random variables is a well-known obstacle that drastically slows down $Q$-learning algorithms. We propose to use additional insight gained from distributional reinforcement learning to deal with overestimation in a locally adaptive way. This makes it possible to balance the strengths and weaknesses of the different $Q$-learning variants in a unified framework. Our framework, ADDQ, is simple to implement: existing RL algorithms can be improved with a few lines of additional code. We provide experimental results in tabular, Atari, and MuJoCo environments for discrete and continuous control problems, comparisons with state-of-the-art methods, and a proof of convergence.
Submission Number: 68
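To make the abstract's idea concrete, below is a minimal, self-contained sketch of the general mechanism it describes: blending the standard $Q$-learning target with the double $Q$-learning target using a locally adaptive weight derived from a distributional spread estimate. All specifics (the quantile representation, the `beta_from_spread` weighting, and the update rule) are illustrative assumptions for this sketch, not the paper's exact ADDQ algorithm.

```python
# Tabular sketch: two distributional (quantile) estimators, double-Q-style
# updates, and a per-state-action blend between the standard target
# (selection and evaluation by the same estimator) and the double target
# (evaluation by the other estimator). The blend weight comes from the
# estimated return spread -- a hypothetical choice for illustration.
import numpy as np

n_states, n_actions, n_quantiles = 5, 3, 8
alpha, gamma = 0.1, 0.99
rng = np.random.default_rng(0)

# Two independent quantile tables, one per estimator.
Z = [np.zeros((n_states, n_actions, n_quantiles)) for _ in range(2)]

def q_values(z):
    # Mean over quantiles recovers the usual Q-value estimate.
    return z.mean(axis=-1)

def beta_from_spread(z, s, a, scale=1.0):
    # Hypothetical local weight: the larger the estimated return spread,
    # the more weight goes to the (less overestimation-prone) double target.
    spread = z[s, a].std()
    return 1.0 / (1.0 + scale * spread)  # in (0, 1]

def adaptive_double_q_update(s, a, r, s_next, done):
    i = rng.integers(2)            # estimator to update (double Q-learning style)
    j = 1 - i                      # the other estimator evaluates
    a_star = q_values(Z[i])[s_next].argmax()

    standard_eval = Z[i][s_next, a_star].mean()   # same estimator evaluates
    double_eval = Z[j][s_next, a_star].mean()     # other estimator evaluates

    beta = beta_from_spread(Z[i], s_next, a_star)
    blended_eval = beta * standard_eval + (1.0 - beta) * double_eval

    target = r + (0.0 if done else gamma * blended_eval)
    # Crude quantile update: move all quantiles toward the blended target.
    Z[i][s, a] += alpha * (target - Z[i][s, a])

# Example transition update.
adaptive_double_q_update(s=0, a=1, r=1.0, s_next=2, done=False)
print(q_values(Z[0])[0, 1])
```

The key design point illustrated here is that the correction is locally adaptive: states and actions with a wide estimated return distribution lean toward the double target, while narrow distributions keep the standard target, rather than applying one global correction everywhere.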