Keywords: reinforcement learning, self-supervision, POMDP
TL;DR: event discovery to represent the history for the agent in RL
Abstract: Environments in Reinforcement Learning (RL) are usually only partially observable. To address this problem, a possible solution is to provide the agent with information about past observations. While common methods represent this history using a Recurrent Neural Network (RNN), in this paper we propose an alternative representation which is based on the record of the past events observed in a given episode. Inspired by the human memory, these events describe only important changes in the environment and, in our approach, are automatically discovered using self-supervision.
We evaluate our history representation method using two challenging RL benchmarks: some games of the Atari-57 suite and the 3D environment Obstacle Tower. Using these benchmarks we show the advantage of our solution with respect to common RNN-based approaches.
Code: https://github.com/iclr2020anon/EDHR
Original Pdf: pdf
29 Replies
Loading