Sample Efficient Actor-Critic with Experience Replay

Ziyu Wang, Victor Bapst, Nicolas Heess, Volodymyr Mnih, Remi Munos, Koray Kavukcuoglu, Nando de Freitas

Nov 04, 2016 (modified: Feb 23, 2017) ICLR 2017 conference submission
  • Abstract: This paper presents an actor-critic deep reinforcement learning agent with experience replay that is stable, sample efficient, and performs remarkably well on challenging environments, including the discrete 57-game Atari domain and several continuous control problems. To achieve this, the paper introduces several innovations, including truncated importance sampling with bias correction, stochastic dueling network architectures, and a new trust region policy optimization method.
  • TL;DR: Prepared for ICLR 2017.
  • Conflicts: google.com, ox.ac.uk, ubc.ca
  • Keywords: Deep learning, Reinforcement Learning
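The abstract's first listed innovation, truncated importance sampling with bias correction, can be illustrated with a minimal sketch. This is not the paper's implementation: the function name, the toy distributions, and the threshold value are all illustrative. The idea, as described in the abstract, is to clip importance ratios at a constant to bound variance, then add a correction term computed under the target policy so the estimator remains unbiased.

```python
import numpy as np

def truncated_is_with_bias_correction(pi, mu, q, c=10.0):
    """Illustrative estimator of E_{a~pi}[q(a)] from behavior policy mu.

    Importance ratios rho = pi/mu are truncated at c (bounding variance);
    the mass removed by truncation is restored exactly by a correction
    term taken in expectation under the target policy pi.

    pi, mu: target / behavior action probabilities (1-D, sum to 1)
    q:      per-action values
    c:      truncation threshold for the importance ratios (assumed value)
    """
    rho = pi / mu  # importance ratios
    # Truncated term: expectation under the behavior policy mu.
    truncated = np.sum(mu * np.minimum(rho, c) * q)
    # Bias correction: expectation under pi of the truncated-away mass.
    correction = np.sum(pi * np.maximum(0.0, (rho - c) / rho) * q)
    return truncated + correction

pi = np.array([0.7, 0.2, 0.1])   # target policy (illustrative)
mu = np.array([0.1, 0.45, 0.45]) # behavior policy (illustrative)
q = np.array([1.0, 2.0, 3.0])    # per-action values
estimate = truncated_is_with_bias_correction(pi, mu, q, c=2.0)
```

With the correction term included, the estimator matches the exact expectation under the target policy even when ratios are heavily truncated; dropping the correction would bias the estimate toward the behavior policy.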