Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

Published: 21 Jul 2022 · Last Modified: 22 Oct 2023 · ICLR 2017 Poster · Readers: Everyone
Abstract: We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for other asynchronous algorithms as well. Our hybrid CPU/GPU version of A3C, based on TensorFlow, achieves a significant speedup compared to a CPU implementation; we make it publicly available to other researchers at https://github.com/NVlabs/GA3C.
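As a rough illustration of the queue system the abstract describes, the sketch below shows how CPU-side agents might submit states to a prediction queue that predictor threads drain into batched GPU inference calls, while trainer threads batch experiences for GPU gradient updates. The `Agent` class, the `model.predict`/`model.train_step` interface, and the batch sizes are hypothetical stand-ins, not the paper's actual code (see the GitHub link above for that).

```python
import queue
import numpy as np

# Hypothetical batch sizes; the paper's dynamic scheduler tunes such knobs.
PREDICTION_BATCH = 32
TRAINING_BATCH = 128

prediction_q = queue.Queue(maxsize=256)  # holds (agent_id, state) pairs
training_q = queue.Queue(maxsize=256)    # holds (state, action, return) tuples

class Agent:
    """An actor running its own environment instance on the CPU."""
    def __init__(self, agent_id):
        self.agent_id = agent_id
        self.result_q = queue.Queue()  # receives (policy, value) for its state

def predictor_loop(model, agents):
    """Batch queued states into a single GPU forward pass, then route each
    policy/value pair back to the agent that requested it."""
    while True:
        ids, states = [], []
        aid, s = prediction_q.get()  # block until at least one state arrives
        ids.append(aid)
        states.append(s)
        while len(states) < PREDICTION_BATCH:
            try:
                aid, s = prediction_q.get_nowait()  # drain without blocking
            except queue.Empty:
                break
            ids.append(aid)
            states.append(s)
        policies, values = model.predict(np.stack(states))  # one GPU call
        for i, aid in enumerate(ids):
            agents[aid].result_q.put((policies[i], values[i]))

def trainer_loop(model):
    """Accumulate experiences and apply one batched gradient update."""
    while True:
        batch = [training_q.get() for _ in range(TRAINING_BATCH)]
        states, actions, returns = (np.stack(x) for x in zip(*batch))
        model.train_step(states, actions, returns)  # one GPU call
```

In the system the paper describes, several such predictor and trainer threads run concurrently, and the dynamic scheduling strategy adjusts their counts at runtime to keep the GPU saturated.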
TL;DR: Implementation and analysis of the computational aspect of a GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm
Keywords: Reinforcement Learning
Conflicts: nvidia.com, unimi.it, mit.edu, ucl.ac.uk, wustl.edu, cornell.edu, fb.com, illinois.edu, microsoft.com, sharif.edu, umich.edu
Community Implementations: [3 code implementations](https://www.catalyzex.com/paper/arxiv:1611.06256/code)