Rewardless Open-Ended Learning (ROEL)

Alexander Quessy; Thomas Stuart Richardson

Published: 28 Jan 2022, Last Modified: 13 Feb 2023, ICLR 2022 Submitted, Readers: Everyone

Keywords: unsupervised reinforcement learning, open-ended learning, skill discovery

Abstract: Open-ended learning algorithms aim to automatically generate challenges and solutions to an unending sequence of learning opportunities. In Reinforcement Learning (RL), recent approaches to open-ended learning, such as Paired Open-Ended Trailblazer (POET), focus on collecting a diverse set of solutions based on the novelty of an agent's pre-defined reward function. In many practical RL tasks, defining an effective reward function a priori is hard and can hinder an agent's ability to explore behaviors that could ultimately be more performant. In this work we combine open-ended learning with unsupervised reinforcement learning to train agents to learn a diverse set of complex skills. We propose a procedure that combines skill discovery via mutual information with the POET algorithm as an open-ended framework, teaching agents increasingly complex groups of diverse skills. Experimentally, we demonstrate that this approach yields agents capable of demonstrating identifiable skills over a range of environments, and that these skills can be extracted and utilized to solve a variety of tasks.

One-sentence Summary: We present ROEL, an unsupervised open-ended reinforcement learning algorithm that aims to automatically generate increasingly complex and useful skills.

Supplementary Material: zip
