Keywords: Empowerment, Unsupervised Skill Learning, Unsupervised Reinforcement Learning, Self-supervised Reinforcement Learning
TL;DR: We present a more scalable empowerment algorithm that enables agent to learn large skillsets while only requiring a latent-predictive model rather than a simulator of the environment.
Abstract: Empowerment has the potential to help agents learn large skillsets, but is not yet a scalable solution for training general-purpose agents. Recent empowerment methods learn large skillsets by maximizing the mutual information between skills and states, but these approaches require a model of the transition dynamics, which can be challenging to learn in realistic settings with high-dimensional and stochastic observations. We present an algorithm, Latent-Predictive Empowerment (LPE), that can compute empowerment in a more scalable manner. LPE learns large skillsets by maximizing an objective that under certain conditions has the same optimal skillset as the mutual information between skills and states, but our objective is more tractable to optimize because it only requires learning a simpler latent-predictive model rather than a full simulator of the environment. We show empirically in a variety of settings, includes ones with high-dimensional observations and highly stochastic transition dynamics, that our empowerment objective learns similar-sized skillsets as the leading empowerment algorithm, which assumes access to a model of the transition dynamics, and outperforms other model-based approaches to empowerment.
Primary Area: reinforcement learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 12749
Loading