Sample Efficient Actor-Critic with Experience Replay
Ziyu Wang, Victor Bapst, Nicolas Heess, Volodymyr Mnih, Remi Munos, Koray Kavukcuoglu, Nando de Freitas
Nov 04, 2016 (modified: Feb 23, 2017) · ICLR 2017 conference submission · Readers: everyone
Abstract: This paper presents an actor-critic deep reinforcement learning agent with experience replay that is stable, sample efficient, and performs remarkably well on challenging environments, including the discrete 57-game Atari domain and several continuous control problems. To achieve this, the paper introduces several innovations, including truncated importance sampling with bias correction, stochastic dueling network architectures, and a new trust region policy optimization method.
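The first innovation listed in the abstract, truncated importance sampling with bias correction, can be illustrated with a minimal sketch. The idea is to split the importance weight ρ = π/μ into a clipped part min(c, ρ) applied to sampled actions (bounding variance) plus a correction coefficient max(0, (ρ − c)/ρ) applied under the target policy (removing the bias the clipping introduces). The helper name and interface below are hypothetical, not the paper's code; the full agent applies these weights inside its policy-gradient update.

```python
import numpy as np

def truncation_with_bias_correction(pi, mu, c=10.0):
    """Hypothetical helper sketching truncated importance sampling with
    bias correction. pi, mu: target/behavior action probabilities for
    sampled actions; c: truncation threshold. Returns the truncated
    weight min(c, rho) and the bias-correction coefficient
    max(0, (rho - c) / rho), where rho = pi / mu."""
    rho = pi / mu
    truncated = np.minimum(c, rho)                 # variance-bounded term
    correction = np.maximum(0.0, (rho - c) / rho)  # reweights the correction term
    return truncated, correction
```

A useful sanity check on this decomposition is the identity min(c, ρ) + ρ·max(0, (ρ − c)/ρ) = ρ: the two terms together recover the full importance weight, so no bias remains, while each term individually has bounded coefficients.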
TL;DR: Prepared for ICLR 2017.
Conflicts: google.com, ox.ac.uk, ubc.ca
Keywords: Deep Learning, Reinforcement Learning