Episodic Reinforcement Learning with Associative Memory

Guangxiang Zhu*; Zichuan Lin*; Guangwen Yang; Chongjie Zhang

Published: 20 Dec 2019, Last Modified: 05 May 2023
Venue: ICLR 2020 Conference Blind Submission
Readers: Everyone

Abstract: Sample efficiency has been one of the major challenges for deep reinforcement learning. Non-parametric episodic control has been proposed to speed up parametric reinforcement learning by rapidly latching onto previously successful policies. However, previous work on episodic reinforcement learning neglects the relationships between states and stores experiences only as unrelated items. To improve the sample efficiency of reinforcement learning, we propose a novel framework, called Episodic Reinforcement Learning with Associative Memory (ERLAM), which associates related experience trajectories to enable reasoning about effective strategies. We build a graph on top of the states in memory based on state transitions and develop a reverse-trajectory propagation strategy that allows rapid value propagation through the graph. We use the non-parametric associative memory as early guidance for a parametric reinforcement learning model. Results on a navigation domain and Atari games show that our framework achieves significantly higher sample efficiency than state-of-the-art episodic reinforcement learning models.

Keywords: Deep Reinforcement Learning, Episodic Control, Episodic Memory, Associative Memory, Non-Parametric Method, Sample Efficiency
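
To make the graph and reverse-trajectory idea from the abstract concrete, the following is a minimal tabular sketch, not the authors' implementation. It assumes hashable states, deterministic transitions, and a one-step backup of the form Q(s, a) = r + gamma * max over a' of Q(s', a'); the abstract does not state the update rule, so that form is an assumption. The function name propagate_values and the parameters gamma and sweeps are hypothetical.

```python
from collections import defaultdict


def propagate_values(trajectories, gamma=0.99, sweeps=10):
    """Propagate values through a graph built from stored transitions.

    trajectories: list of episodes, each a list of
                  (state, action, reward, next_state) tuples.
    Returns a dict mapping (state, action) to a value estimate.
    """
    q = defaultdict(float)      # (state, action) -> value estimate
    actions = defaultdict(set)  # state -> actions stored in memory (graph edges)

    # Build the graph: states shared between episodes link those episodes.
    for episode in trajectories:
        for s, a, _r, _s_next in episode:
            actions[s].add(a)

    # Sweep each episode from its last transition back to its first, so a
    # reward found late in an episode reaches earlier states in one pass.
    for _ in range(sweeps):
        for episode in trajectories:
            for s, a, r, s_next in reversed(episode):
                best_next = max(
                    (q[(s_next, b)] for b in actions[s_next]), default=0.0
                )
                q[(s, a)] = r + gamma * best_next  # assumed one-step backup
    return dict(q)


# Two episodes that share state "B"; only the second one finds a reward.
episode_1 = [("A", "right", 0.0, "B"), ("B", "right", 0.0, "C")]
episode_2 = [("D", "up", 0.0, "B"), ("B", "up", 1.0, "G")]
values = propagate_values([episode_1, episode_2], gamma=0.5, sweeps=3)
print(values[("A", "right")])
```

In this toy example, state A lies only on the unrewarded episode, yet it receives a positive value because both episodes pass through B. Memory that stores episodes as unrelated items would leave A at zero, which is the limitation the abstract attributes to earlier episodic methods.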
