Successor Options : An Option Discovery Algorithm for Reinforcement Learning

Manan Tomar*; Rahul Ramesh*; Balaraman Ravindran

27 Sept 2018 (modified: 05 May 2023), ICLR 2019 Conference Blind Submission, Readers: Everyone

Abstract: Hierarchical Reinforcement Learning is a popular approach that exploits temporal abstraction to tackle the curse of dimensionality. The options framework is one such hierarchical framework that models the notion of skills or options. However, learning a collection of task-agnostic, transferable skills is challenging. Option discovery typically relies on heuristics, the majority of which revolve around discovering bottleneck states. In this work, we adopt an approach complementary to bottleneck discovery: we attempt to discover "landmark" sub-goals, which are prototypical states of well-connected regions. These sub-goals are points from which a densely connected set of states is easily accessible. We propose a new model called Successor Options that leverages Successor Representations to discover these sub-goals. We also design a novel pseudo-reward for learning the intra-option policies. Additionally, we describe an Incremental Successor Options model that iteratively builds options and explores in environments where exploration through primitive actions is inadequate to form the Successor Representations. Finally, we demonstrate the efficacy of our approach on a collection of grid worlds and on complex high-dimensional environments such as DeepMind Lab.

Keywords: Hierarchical Reinforcement Learning

TL;DR: An option discovery method for Reinforcement Learning using the Successor Representation
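
For readers unfamiliar with the Successor Representation (SR) that the abstract builds on, the following is a minimal sketch of its standard tabular closed form, M = (I - gamma * P)^-1, for a fixed policy with state-transition matrix P. It is not the authors' implementation: the five-state chain, the uniform random-walk policy, and the function names `successor_representation` and `random_walk_chain` are assumptions chosen only for illustration.

```python
import numpy as np

def successor_representation(P, gamma=0.95):
    """Closed-form tabular SR for a fixed policy: M = (I - gamma * P)^-1.

    M[s, s'] is the expected discounted number of visits to s' when
    starting from s and following the policy that induces P.
    """
    n = P.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * P)

def random_walk_chain(n):
    """Hypothetical toy environment: uniform random walk on an n-state chain.

    A move into either end wall leaves the agent where it is.
    """
    P = np.zeros((n, n))
    for s in range(n):
        P[s, max(s - 1, 0)] += 0.5
        P[s, min(s + 1, n - 1)] += 0.5
    return P

P = random_walk_chain(5)
M = successor_representation(P, gamma=0.9)

# Each row of M is the SR vector of one state.
print(np.round(M, 2))
```

Each row of `M` summarizes which states are reachable from a given state and how soon, which is the kind of connectivity information the abstract's "landmark" sub-goals draw on. How the paper selects landmarks from these vectors is not stated in the abstract and is not shown here.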
