- Keywords: meta-learning, exploration, curiosity
- TL;DR: Meta-learning curiosity algorithms by searching through a rich space of programs yields novel mechanisms that generalize across very different reinforcement-learning domains.
- Abstract: Exploration is a key component of successful reinforcement learning, but optimal approaches are computationally intractable, so researchers have focused on hand-designing mechanisms based on exploration bonuses and intrinsic reward, some inspired by curious behavior in natural systems. In this work, we propose a strategy for encoding curiosity algorithms as programs in a domain-specific language and searching, during a meta-learning phase, for algorithms that enable RL agents to perform well in new domains. Our rich language of programs, which can combine neural networks with other building blocks including nearest-neighbor modules and can choose its own loss functions, enables the expression of highly generalizable programs that perform well in domains as disparate as grid navigation with image input, acrobot, lunar lander, ant and hopper. To make this approach feasible, we develop several pruning techniques, including learning to predict a program's success based on its syntactic properties. We demonstrate the effectiveness of the approach empirically, finding curiosity strategies that are similar to those in published literature, as well as novel strategies that are competitive with them and generalize well.
- Code: http://bit.ly/meta-learning-curiosity-algs