Information-Directed Sampling for Reinforcement Learning

Junyang Qian, Junzi Zhang

01 Sept 2020, OpenReview Archive Direct Upload, Readers: Everyone

Abstract: Information-theoretic criteria have gained popularity in reinforcement learning in recent years, thanks to their interpretability and the empirical performance gains they deliver. For bandit learning problems, Information-Directed Sampling (IDS) has been introduced with proven asymptotically optimal regret bounds. In this report, we extend IDS to general reinforcement learning problems, with the hope of obtaining more reliable regret guarantees. We propose practical algorithms for both the model-based and model-free settings, and numerical experiments show the potential improvement in data efficiency from our algorithms.
