Analytically Tractable Bayesian Deep Q-Learning

Published: 28 Jan 2022, Last Modified: 22 Oct 2023 · ICLR 2022 Submitted · Readers: Everyone
Keywords: Bayesian Learning, Probabilistic Methods, Uncertainty Quantification, Reinforcement Learning, Deep Q-learning
Abstract: Reinforcement learning (RL) has gained increasing interest since deep Q-learning (DQN) was shown to reach human-level performance on video game benchmarks. The current consensus for training neural networks (NNs) on such complex environments is to rely on gradient-descent (GD) optimization. This consensus ignores the uncertainty of the NN's parameters, which is a key aspect of selecting an optimal action given a state. Although alternative Bayesian deep learning methods exist, most of them still rely on GD and numerical approximations, and they typically do not scale to complex benchmarks such as the Atari game environment. In this paper, we present how the temporal-difference Q-learning framework can be adapted to be compatible with tractable approximate Gaussian inference (TAGI), which estimates the posterior distribution of a NN's parameters using a closed-form analytical method. Through experiments with on- and off-policy reinforcement learning approaches, we demonstrate that TAGI can reach a performance comparable to backpropagation-trained networks while using only half the number of hyperparameters, and without relying on GD or numerical approximations.
One-sentence Summary: We apply tractable approximate Bayesian inference to deep reinforcement learning
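
The following is a minimal, hypothetical sketch (not the authors' code) of the idea the abstract describes: a temporal-difference Q-learning loop whose targets are consumed by a network with an analytic Gaussian posterior over its parameters, so no gradient descent is needed. Since TAGI's layer-wise inference equations are beyond a short snippet, the `GaussianQNetwork` class here stands in with a Bayesian linear model over random Fourier features, which also admits a closed-form Gaussian update; the names `GaussianQNetwork`, `td_step`, and `select_action` are illustrative and do not come from the paper or any TAGI library.

```python
import numpy as np

class GaussianQNetwork:
    """Hypothetical stand-in for a TAGI network: Bayesian linear regression
    over random Fourier features, giving a Gaussian predictive distribution
    and a closed-form (gradient-free) posterior update."""
    def __init__(self, state_dim, n_actions, n_features=128, obs_noise=1.0):
        rng = np.random.default_rng(0)
        self.W = rng.normal(size=(n_features, state_dim + n_actions))
        self.b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
        self.obs_noise = obs_noise
        self.mu = np.zeros(n_features)    # posterior mean of the weights
        self.Sigma = np.eye(n_features)   # posterior covariance of the weights
        self.n_actions = n_actions

    def _phi(self, state, action):
        # Random Fourier features of the (state, one-hot action) pair
        x = np.concatenate([state, np.eye(self.n_actions)[action]])
        return np.cos(self.W @ x + self.b)

    def predict(self, state, action):
        # Predictive mean and variance of Q(s, a) under the posterior
        phi = self._phi(state, action)
        return phi @ self.mu, phi @ self.Sigma @ phi + self.obs_noise

    def update(self, state, action, target):
        # Closed-form Gaussian conditioning on one observed TD target,
        # playing the role of TAGI's analytical parameter update
        phi = self._phi(state, action)
        gain = self.Sigma @ phi / (phi @ self.Sigma @ phi + self.obs_noise)
        self.mu = self.mu + gain * (target - phi @ self.mu)
        self.Sigma = self.Sigma - np.outer(gain, phi @ self.Sigma)

def select_action(q_net, state):
    # Thompson-style exploration: sample each Q-value from its posterior
    # predictive distribution and act greedily on the samples
    samples = []
    for a in range(q_net.n_actions):
        mean, var = q_net.predict(state, a)
        samples.append(np.random.normal(mean, np.sqrt(var)))
    return int(np.argmax(samples))

def td_step(q_net, state, action, reward, next_state, done, gamma=0.99):
    # The TD target y = r + gamma * max_a' E[Q(s', a')] is treated as the
    # observation that the closed-form inference conditions on
    target = reward
    if not done:
        target += gamma * max(q_net.predict(next_state, a)[0]
                              for a in range(q_net.n_actions))
    q_net.update(state, action, target)

# Toy usage on a 4-dimensional state space with 2 actions
q_net = GaussianQNetwork(state_dim=4, n_actions=2)
s = np.zeros(4)
a = select_action(q_net, s)
td_step(q_net, s, a, reward=1.0, next_state=np.ones(4), done=False)
```

In the paper, the analytic update runs layer by layer through a full deep network via TAGI; the linear-Gaussian stand-in above only mirrors the interface the abstract implies: uncertainty-aware prediction and a closed-form posterior update with no gradients.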
Community Implementations: [2 code implementations on CatalyzeX](https://www.catalyzex.com/paper/arxiv:2106.11086/code)