Keywords: unsupervised skill discovery, reinforcement learning, legged robots
TL;DR: A modular framework for unsupervised skill discovery that factorizes the state space, integrates symmetry and safety priors, and enables zero-shot deployment of structured, interpretable skills on real quadrupedal robots.
Abstract: Unsupervised Skill Discovery (USD) allows agents to autonomously learn diverse behaviors without task-specific rewards. While recent USD methods have shown promise, their application to real-world robotics remains underexplored.
In this paper, we propose a modular USD framework that addresses challenges in the safety, interpretability, and deployability of the learned skills.
Our approach factorizes the state space to learn disentangled skill representations and assigns different skill discovery algorithms to each factor based on the desired intrinsic reward function.
To encourage structured morphology-aware skills, we introduce symmetry-based inductive biases tailored to individual factors. We also incorporate a style factor and regularization penalties to promote safe and robust behaviors.
We evaluate our framework in simulation using a quadrupedal robot and demonstrate zero-shot transfer of the learned skills to real hardware. Our results show that factorization and symmetry lead to the discovery of structured, human-interpretable behaviors, while the style factor and penalties enhance safety and diversity. Additionally, we show that the learned skills can be used for downstream tasks and perform on par with oracle policies trained with hand-crafted rewards.
To facilitate future research, we will release our code upon publication.
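As a rough illustration of the factorized design described above, the sketch below combines per-factor intrinsic rewards into a single scalar. All names here (`reward_fns`, `weights`, the factor keys) are hypothetical placeholders, not the paper's actual API: the point is only that each state factor can carry its own skill-discovery objective.

```python
import numpy as np

def factored_intrinsic_reward(state_factors, skills, reward_fns, weights):
    """Hypothetical sketch: sum weighted per-factor intrinsic rewards.

    state_factors : dict mapping factor name -> state sub-vector
    skills        : dict mapping factor name -> skill latent for that factor
    reward_fns    : dict mapping factor name -> its intrinsic reward function
    weights       : dict mapping factor name -> scalar weight (illustrative)
    """
    total = 0.0
    for name, s in state_factors.items():
        total += weights[name] * reward_fns[name](s, skills[name])
    return total
```

A style factor or safety penalty would enter the same way, as one more entry in `reward_fns` with its own weight.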
Supplementary Material: zip
Submission Number: 702