Curiosity-driven Exploration by Bootstrapping Features

Harri Edwards, Yuri Burda, Amos Storkey

Feb 15, 2018 (modified: Feb 15, 2018) ICLR 2018 Conference Blind Submission readers: everyone
  • Abstract: We introduce CBF, an exploration method that works in the absence of rewards or an end-of-episode signal. CBF is based on an intrinsic reward derived from the error of a dynamics model operating in feature space. It was inspired by Pathak et al. (2017), is easy to implement, and can achieve results such as passing four levels of Super Mario Bros, navigating VizDoom mazes, and passing two levels of SpaceInvaders. We investigated the effect of combining the method with several auxiliary tasks, but found inconsistent improvements over the CBF baseline.
  • TL;DR: A simple intrinsic motivation method that uses the error of a forward dynamics model in the feature space of the policy.
  • Keywords: exploration, intrinsic motivation, reinforcement learning
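
The intrinsic reward described in the abstract can be illustrated with a minimal sketch: embed the current and next observation, predict the next embedding from the current embedding and the action, and use the squared prediction error as the reward. This is not the authors' implementation. The function names, the linear forward model, the layer sizes, and the fixed random feature map are all assumptions made for illustration; per the TL;DR, the actual method uses the policy's own features rather than a fixed map.

```python
import numpy as np

def embed(obs, W_phi):
    # Hypothetical feature map phi(s): one linear layer followed by tanh.
    # Assumption: stands in for the policy's feature layers.
    return np.tanh(obs @ W_phi)

def predict_next_features(feat, action_onehot, W_f):
    # Hypothetical linear forward model f(phi(s), a) -> predicted phi(s').
    return np.concatenate([feat, action_onehot], axis=-1) @ W_f

def intrinsic_reward(obs, action_onehot, next_obs, W_phi, W_f):
    # Squared forward-model error in feature space, one value per transition.
    pred = predict_next_features(embed(obs, W_phi), action_onehot, W_f)
    target = embed(next_obs, W_phi)
    return np.sum((pred - target) ** 2, axis=-1)

# Toy batch of random transitions (illustrative sizes only).
rng = np.random.default_rng(0)
obs_dim, feat_dim, n_actions, batch = 8, 4, 3, 5
W_phi = rng.normal(size=(obs_dim, feat_dim))
W_f = rng.normal(size=(feat_dim + n_actions, feat_dim))
obs = rng.normal(size=(batch, obs_dim))
next_obs = rng.normal(size=(batch, obs_dim))
actions = np.eye(n_actions)[rng.integers(0, n_actions, size=batch)]

rewards = intrinsic_reward(obs, actions, next_obs, W_phi, W_f)
print(rewards.shape)
```

In a full agent, the forward model would be trained to reduce this error, so transitions the model already predicts well would yield little reward and poorly predicted ones would yield more.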