Exploration in Reinforcement Learning with Deep Covering Options

Yuu Jinnai; Jee Won Park; Marlos C. Machado; George Konidaris

Published: 20 Dec 2019, Last Modified: 05 May 2023

ICLR 2020 Conference Blind Submission

Readers: Everyone

Keywords: Reinforcement learning, temporal abstraction, exploration

TL;DR: We introduce a method to automatically discover task-agnostic options that encourage exploration for reinforcement learning.

Abstract: While many option discovery methods have been proposed to accelerate exploration in reinforcement learning, they are often heuristic. Recently, covering options were proposed to discover a set of options that provably reduce the upper bound of the environment's cover time, a measure of the difficulty of exploration. Covering options are computed using the eigenvectors of the graph Laplacian, but they are constrained to tabular tasks and are not applicable to tasks with large or continuous state spaces. We introduce deep covering options, an online method that extends covering options to large state spaces, automatically discovering task-agnostic options that encourage exploration. We evaluate our method in several challenging sparse-reward domains and show that our approach identifies less explored regions of the state space and successfully generates options to visit these regions, substantially improving both exploration and total accumulated reward.
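
As a rough illustration of the tabular idea the abstract refers to, the sketch below computes the eigenvector of the graph Laplacian associated with its second-smallest eigenvalue (the Fiedler vector) and picks the two states at its extremes as candidate option endpoints. This is a minimal sketch under assumptions, not the authors' implementation: the state-transition graph is assumed to be small, undirected, and connected, and the names `fiedler_vector`, `covering_option_endpoints`, and `path_graph` are hypothetical helpers introduced only for this example. The deep variant described in the abstract replaces this exact eigendecomposition with an online approximation, which is not shown here.

```python
import numpy as np


def fiedler_vector(adjacency):
    """Return (second-smallest eigenvalue, its eigenvector) of L = D - A.

    Assumes `adjacency` is a symmetric 0/1 matrix of a connected graph.
    """
    adjacency = np.asarray(adjacency, dtype=float)
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    return eigenvalues[1], eigenvectors[:, 1]


def covering_option_endpoints(adjacency):
    """Pick the states with the smallest and largest Fiedler-vector entries.

    An option connecting these two states is the kind of shortcut the
    tabular method is described as adding (illustrative only).
    """
    _, vector = fiedler_vector(adjacency)
    return int(np.argmin(vector)), int(np.argmax(vector))


def path_graph(n):
    """Adjacency matrix of a chain of n states (hypothetical toy domain)."""
    adjacency = np.zeros((n, n))
    for i in range(n - 1):
        adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0
    return adjacency


if __name__ == "__main__":
    chain = path_graph(6)
    start, goal = covering_option_endpoints(chain)
    connectivity_before, _ = fiedler_vector(chain)

    # Model the option as an extra edge between the two extreme states.
    chain[start, goal] = chain[goal, start] = 1.0
    connectivity_after, _ = fiedler_vector(chain)

    print("option endpoints:", sorted((start, goal)))
    print("algebraic connectivity increased:", connectivity_after > connectivity_before)
```

On a chain of states the two endpoints are the far ends of the chain, and adding the shortcut raises the second-smallest Laplacian eigenvalue (the algebraic connectivity), which is the quantity tied to the cover-time bound. The sign of an eigenvector is arbitrary, so which endpoint is reported as the minimum and which as the maximum may vary.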

Data: [MuJoCo](https://paperswithcode.com/dataset/mujoco)
