A Probabilistic Perspective on Reinforcement Learning via Supervised Learning

04 Mar 2022, 07:18 (edited 08 Apr 2022) | ICLR 2022 GPL Poster | Readers: Everyone
  • Keywords: offline rl, control as inference, rl via supervised learning, probabilistic inference
  • TL;DR: A Probabilistic Perspective on Reinforcement Learning via Supervised Learning algorithms
  • Abstract: Reinforcement Learning via Supervised Learning (RvS) uses only supervised techniques to learn desirable behaviors from large datasets. RvS has attracted much attention lately due to its simplicity and its ability to leverage diverse trajectories. We introduce Density to Decision (D2D), a new framework that unifies a myriad of RvS algorithms. The D2D framework formulates RvS as a two-step process: i) density estimation via supervised learning and ii) decision making via exponential tilting of the density. Using our framework, we categorise popular RvS algorithms and show how they differ in the design choices made in their implementation. We then introduce a novel algorithm, Implicit RvS, which leverages powerful density estimation techniques that can easily be tilted to produce desirable behaviors. We compare the performance of a suite of RvS algorithms on the D4RL benchmark. Finally, we highlight the limitations of current RvS algorithms in comparison with traditional RL algorithms.
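
The two-step process described in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: it assumes a single-step problem with discrete actions and binned returns, uses a count-based histogram as a stand-in for a learned density model, and the names `estimate_density`, `tilt_policy`, and `eta` are hypothetical. It only shows the generic idea of exponential tilting, where a density p(a, R) is reweighted by exp(eta * R) and renormalised.

```python
import numpy as np

def estimate_density(actions, return_bins, n_actions, n_bins):
    """Step i (stand-in): empirical joint density p(a, R) from counts."""
    counts = np.zeros((n_actions, n_bins))
    # Accumulate one count per (action, return-bin) pair in the dataset.
    np.add.at(counts, (actions, return_bins), 1.0)
    return counts / counts.sum()

def tilt_policy(density, return_values, eta):
    """Step ii: tilt p(a, R) by exp(eta * R), renormalise, marginalise over R."""
    # Subtracting the max return keeps the exponent numerically stable;
    # the constant factor cancels in the renormalisation.
    weights = np.exp(eta * (return_values - return_values.max()))
    tilted = density * weights[None, :]
    tilted /= tilted.sum()
    return tilted.sum(axis=1)  # action probabilities

# Hypothetical toy dataset: action 1 reaches the high-return bin more often.
actions = np.array([0, 0, 0, 1, 1, 1])
return_bins = np.array([0, 0, 1, 1, 1, 0])
return_values = np.array([0.0, 1.0])  # return associated with each bin

density = estimate_density(actions, return_bins, n_actions=2, n_bins=2)
behaviour = tilt_policy(density, return_values, eta=0.0)  # no tilting
tilted = tilt_policy(density, return_values, eta=5.0)     # favour high return
```

With `eta = 0` the tilted policy reduces to the action marginal of the data; increasing `eta` shifts probability toward actions that co-occur with high returns in the estimated density.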