Distributed Prioritized Experience Replay

15 Feb 2018 (modified: 07 Apr 2024) · ICLR 2018 Conference Blind Submission · Readers: Everyone
Abstract: We propose a distributed architecture for deep reinforcement learning at scale that enables agents to learn effectively from orders of magnitude more data than previously possible. The algorithm decouples acting from learning: the actors interact with their own instances of the environment by selecting actions according to a shared neural network, and accumulate the resulting experience in a shared experience replay memory; the learner replays samples of experience and updates the neural network. The architecture relies on prioritized experience replay to focus only on the most significant data generated by the actors. Our architecture substantially improves the state of the art on the Arcade Learning Environment, achieving better final performance in a fraction of the wall-clock training time.
TL;DR: A distributed architecture for deep reinforcement learning at scale, using parallel data-generation to improve the state of the art on the Arcade Learning Environment benchmark in a fraction of the wall-clock training time of previous approaches.
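The abstract describes the key structural idea: actors generate experience with initial priorities, a shared prioritized replay memory stores it, and a learner samples by priority and writes updated priorities back. Below is a minimal, single-process sketch of that loop, assuming a toy random-walk environment and a tabular Q-function; the names (`PrioritizedReplay`, `actor`, `learner`) are illustrative and not the paper's reference implementation.

```python
import numpy as np

class PrioritizedReplay:
    """Shared replay memory sampled in proportion to stored priorities."""
    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.prios = [], []

    def add(self, transition, priority):
        if len(self.data) >= self.capacity:          # evict oldest experience
            self.data.pop(0); self.prios.pop(0)
        self.data.append(transition)
        self.prios.append(priority ** self.alpha)

    def sample(self, batch_size):
        p = np.asarray(self.prios); p /= p.sum()
        idx = np.random.choice(len(self.data), batch_size, p=p)
        return idx, [self.data[i] for i in idx]

    def update(self, idx, priorities):
        for i, pr in zip(idx, priorities):
            self.prios[i] = pr ** self.alpha

def actor(q, replay, n_states=5, steps=200, gamma=0.99, eps=0.3):
    """Actor: interacts with its own environment copy and ships each transition
    with an initial priority (|TD error|), so the learner can sample it at once."""
    s = 0
    for _ in range(steps):
        a = np.random.randint(2) if np.random.rand() < eps else int(np.argmax(q[s]))
        s2 = min(max(s + (1 if a == 1 else -1), 0), n_states - 1)
        r = 1.0 if s2 == n_states - 1 else 0.0
        td = abs(r + gamma * q[s2].max() - q[s, a])
        replay.add((s, a, r, s2), td + 1e-3)
        s = 0 if s2 == n_states - 1 else s2

def learner(q, replay, batch_size=16, updates=200, gamma=0.99, lr=0.1):
    """Learner: replays prioritized samples, updates the shared value function,
    and writes the refreshed priorities back into the replay memory."""
    for _ in range(updates):
        idx, batch = replay.sample(batch_size)
        new_prios = []
        for s, a, r, s2 in batch:
            td = r + gamma * q[s2].max() - q[s, a]
            q[s, a] += lr * td
            new_prios.append(abs(td) + 1e-3)
        replay.update(idx, new_prios)

q = np.zeros((5, 2))
replay = PrioritizedReplay(capacity=10_000)
actor(q, replay)    # in the paper, many actors run in parallel on their own env instances
learner(q, replay)  # the learner runs concurrently, sampling from the shared memory
print(q)
```

In the full architecture the actor and learner run as separate processes, with many actors feeding one replay memory and periodically syncing network parameters from the learner; this sketch collapses that into sequential calls only to show the data flow.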
Keywords: deep learning, reinforcement learning, distributed systems
Code: [15 community implementations (Papers with Code)](https://paperswithcode.com/paper/?openreview=H1Dy---0Z)
Data: [Arcade Learning Environment](https://paperswithcode.com/dataset/arcade-learning-environment)
Community Implementations: [13 code implementations (CatalyzeX)](https://www.catalyzex.com/paper/arxiv:1803.00933/code)
