Mimicking actions is a good strategy for beginners: Fast Reinforcement Learning with Expert Action Sequences

Sep 27, 2018 ICLR 2019 Conference Blind Submission readers: everyone Show Bibtex
  • Abstract: Imitation Learning is the task of mimicking the behavior of an expert player in a Reinforcement Learning(RL) Environment to enhance the training of a fresh agent (called novice) beginning from scratch. Most of the Reinforcement Learning environments are stochastic in nature, i.e., the state sequences that an agent may encounter usually follow a Markov Decision Process (MDP). This makes the task of mimicking difficult as it is very unlikely that a new agent may encounter same or similar state sequences as an expert. Prior research in Imitation Learning proposes various ways to learn a mapping between the states encountered and the respective actions taken by the expert while mostly being agnostic to the order in which these were performed. Most of these methods need considerable number of states-action pairs to achieve good results. We propose a simple alternative to Imitation Learning by appending the novice‚Äôs action space with the frequent short action sequences that the expert has taken. This simple modification, surprisingly improves the exploration and significantly outperforms alternative approaches like Dataset Aggregation. We experiment with several popular Atari games and show significant and consistent growth in the score that the new agents achieve using just a few expert action sequences.
  • Keywords: Reinforcement Learning, Imitation Learning, Atari, A3C, GA3C
  • TL;DR: Appending most frequent action pairs from an expert player to a novice RL agent's action space improves the scores by huge margin.
0 Replies