Unsupervised Skill Discovery in Non-Markov Settings with Empowerment

Published: 01 Jul 2025 (Last Modified: 01 Jul 2025) — RLBrew: Ingredients for Developing Generalist Agents workshop (RLC 2025). License: CC BY 4.0
Keywords: Unsupervised Skill Discovery, Empowerment, Representation Learning, Unsupervised Reinforcement Learning, Information Seeking, Partial Observability
TL;DR: We introduce an unsupervised skill discovery algorithm that can learn large skillsets in non-Markov settings.
Abstract: General-purpose agents must be able to execute a large number of skills in non-Markov settings. Yet learning diverse sets of policies in these domains is challenging because agents must also learn representations that preserve information about the underlying state contained in histories of actions and observations. We introduce an empowerment-based unsupervised skill discovery algorithm for building skillsets in non-Markov settings. The algorithm maximizes a mutual information objective with respect to both a recurrent neural network (RNN) and a skill-conditioned policy, enabling agents to simultaneously learn a representation and a large number of policies conditioned on that representation. We prove that our objective encourages RNNs that preserve information about the underlying state. We also demonstrate empirically that our approach can learn large skillsets, ranging from hundreds to thousands of skills, in three small non-Markov settings.
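The abstract describes maximizing a mutual information objective jointly over an RNN history encoder and a skill-conditioned policy. The paper's exact algorithm is not given here, so the following is only a minimal sketch of the general recipe it gestures at: a DIAYN-style variational lower bound on I(Z; H), where a GRU summarizes the action-observation history and a discriminator q(z | h) scores skills from the learned representation. All class and function names, dimensions, and architectural choices below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SkillDiscoveryAgent(nn.Module):
    """Hypothetical sketch of empowerment-style skill discovery in a
    non-Markov setting. Architecture and names are assumptions, not the
    paper's actual model."""

    def __init__(self, obs_dim, act_dim, n_skills, hidden=64):
        super().__init__()
        # RNN summarizes the history of actions and observations,
        # standing in for the unobserved underlying state.
        self.rnn = nn.GRU(obs_dim + act_dim, hidden, batch_first=True)
        # Skill-conditioned policy: logits over discrete actions,
        # conditioned on the history representation and a one-hot skill.
        self.policy = nn.Sequential(
            nn.Linear(hidden + n_skills, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim))
        # Variational discriminator q(z | h) used in the MI lower bound.
        self.discriminator = nn.Linear(hidden, n_skills)

    def forward(self, obs_seq, act_seq, skill_onehot):
        # obs_seq: (B, T, obs_dim), act_seq: (B, T, act_dim)
        x = torch.cat([obs_seq, act_seq], dim=-1)
        h_seq, _ = self.rnn(x)
        h = h_seq[:, -1]  # final history representation
        action_logits = self.policy(torch.cat([h, skill_onehot], dim=-1))
        skill_logits = self.discriminator(h)
        return action_logits, skill_logits

def mi_lower_bound_loss(skill_logits, skill_idx):
    # Maximizing E[log q(z | h)] is a variational lower bound on I(Z; H)
    # (up to the entropy of the skill prior), so we minimize the
    # discriminator's negative log-likelihood.
    return F.cross_entropy(skill_logits, skill_idx)

# Toy usage with random history data.
B, T, obs_dim, act_dim, n_skills = 8, 5, 4, 3, 16
agent = SkillDiscoveryAgent(obs_dim, act_dim, n_skills)
skill_idx = torch.randint(n_skills, (B,))
skill_onehot = F.one_hot(skill_idx, n_skills).float()
action_logits, skill_logits = agent(
    torch.randn(B, T, obs_dim), torch.randn(B, T, act_dim), skill_onehot)
loss = mi_lower_bound_loss(skill_logits, skill_idx)
```

In this sketch, gradients of `loss` flow into both the discriminator and the RNN, which is the sense in which the representation is trained to retain skill-distinguishing information from the history; the policy would additionally be trained with the discriminator's log-likelihood as an intrinsic reward.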
Submission Number: 22