Sample Efficient Actor-Critic with Experience Replay

Ziyu Wang; Victor Bapst; Nicolas Heess; Volodymyr Mnih; Remi Munos; Koray Kavukcuoglu; Nando de Freitas

Published: 06 Feb 2017, Last Modified: 12 Oct 2025, ICLR 2017 Poster, Readers: Everyone

Abstract: This paper presents an actor-critic deep reinforcement learning agent with experience replay that is stable, sample efficient, and performs remarkably well on challenging environments, including the discrete 57-game Atari domain and several continuous control problems. To achieve this, the paper introduces several innovations, including truncated importance sampling with bias correction, stochastic dueling network architectures, and a new trust region policy optimization method.
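As a rough illustration of what "truncated importance sampling with bias correction" can mean, the sketch below checks one common form of the idea on a toy discrete distribution: the importance weight ρ(a) = π(a)/μ(a) is clipped at a constant c, and a second term, weighted by max(0, 1 − c/ρ(a)) under the target policy, restores what the clipping removed. The abstract does not spell out this decomposition, so the exact form, the function name `truncated_is_terms`, and all numbers here are assumptions chosen for illustration only.

```python
def truncated_is_terms(pi, mu, f, c):
    """Return (truncated term, correction term) as exact expectations.

    pi, mu: target and behaviour action probabilities (mu assumed > 0, pi assumed > 0).
    f:      per-action quantity whose expectation under pi is wanted.
    c:      truncation threshold for the importance weight (hypothetical value).
    """
    truncated = 0.0
    correction = 0.0
    for p, m, v in zip(pi, mu, f):
        rho = p / m  # importance weight pi(a) / mu(a)
        # Expectation under mu of the clipped weight times f.
        truncated += m * min(c, rho) * v
        # Expectation under pi of the part removed by clipping.
        correction += p * max(0.0, 1.0 - c / rho) * v
    return truncated, correction


# Toy example: three actions, made-up probabilities and values.
pi = [0.7, 0.2, 0.1]
mu = [0.1, 0.3, 0.6]
f = [1.0, -2.0, 0.5]

truncated, correction = truncated_is_terms(pi, mu, f, c=2.0)
target = sum(p * v for p, v in zip(pi, f))  # exact expectation under pi
```

In this toy setting the two terms sum to the exact expectation under π, while the clipped term alone never uses a weight larger than c, which is the variance-control motivation usually given for truncation.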

TL;DR: Prepared for ICLR 2017.

Conflicts: google.com, ox.ac.uk, ubc.ca

Keywords: Deep learning, Reinforcement Learning

Community Implementations: [7 code implementations](https://www.catalyzex.com/paper/sample-efficient-actor-critic-with-experience/code)
