Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling

Anonymous

Nov 03, 2017 (modified: Nov 03, 2017), ICLR 2018 Conference Blind Submission
  • Abstract: Recent interest in decision making with deep neural networks has led to the development of a wide range of practical methods that trade off exploration and exploitation. Bayesian approaches to deep learning are especially appealing for this purpose, as they can provide accurate uncertainty estimates as input to reinforcement learning algorithms. However, these methods are rarely compared on benchmarks that evaluate the impact of their approximations in terms of decision-making performance, and their empirical effectiveness is poorly understood. In this paper, we compare a variety of well-established and recent methods under the lens of Thompson Sampling over a series of contextual bandit problems.
  • TL;DR: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling
  • Keywords: exploration, Thompson Sampling, Bayesian neural networks, bandits, reinforcement learning, variational inference, Monte Carlo
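To make the setting concrete, below is a minimal sketch of Thompson Sampling on a contextual bandit. This is an illustrative assumption, not the paper's method: the paper compares Bayesian deep networks as posterior approximations, whereas here each arm keeps an exact Bayesian linear-regression posterior over its reward weights. The class name `LinearTSAgent` and all parameters are hypothetical.

```python
import numpy as np

# Hypothetical sketch: Thompson Sampling with a per-arm Bayesian
# linear-regression posterior (a simple stand-in for the Bayesian
# deep networks the paper actually benchmarks).
class LinearTSAgent:
    def __init__(self, n_arms, dim, noise_var=0.25, prior_var=1.0, seed=0):
        self.rng = np.random.default_rng(seed)
        self.noise_var = noise_var
        # Posterior precision matrix and precision-weighted mean per arm.
        self.precision = [np.eye(dim) / prior_var for _ in range(n_arms)]
        self.b = [np.zeros(dim) for _ in range(n_arms)]

    def act(self, context):
        # Sample one weight vector per arm from its posterior and
        # play the arm whose sampled expected reward is largest.
        sampled = []
        for P, b in zip(self.precision, self.b):
            cov = np.linalg.inv(P)
            mu = cov @ b
            w = self.rng.multivariate_normal(mu, cov)
            sampled.append(context @ w)
        return int(np.argmax(sampled))

    def update(self, arm, context, reward):
        # Conjugate Gaussian update for the chosen arm's posterior.
        self.precision[arm] += np.outer(context, context) / self.noise_var
        self.b[arm] += reward * context / self.noise_var
```

Exploration here comes entirely from posterior sampling in `act`: arms with few observations have wide posteriors and are sampled optimistically often enough to be tried, which is the property the paper's approximate-posterior comparisons probe.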
