Unsupervised Skill Discovery as Exploration for Learning Agile Locomotion

Published: 08 Aug 2025, Last Modified: 16 Sept 2025 · CoRL 2025 Poster · CC BY 4.0
Keywords: Unsupervised Reinforcement Learning, Locomotion, Quadruped, Skill Discovery
TL;DR: By applying unsupervised reinforcement learning, we trained agile locomotion behaviors without relying on curriculum or reference motions.
Abstract: Exploration is crucial for legged robots to learn agile locomotion behaviors capable of overcoming diverse obstacles. For example, a robot may need to try different contact patterns and momentum profiles to successfully jump over an obstacle—but encouraging such diverse exploration is inherently challenging. As a result, training these behaviors often relies on additional techniques such as extensive reward engineering, expert demonstrations, or curriculum learning. However, these approaches limit generalizability, especially when prior knowledge or demonstration data is unavailable. In this work, we propose using unsupervised skill discovery as a skill-level exploration strategy to significantly reduce human engineering effort. Our learning framework enables the agent to autonomously discover diverse skills to overcome complex obstacles. To dynamically regulate the degree of exploration throughout training, we introduce a bi-level optimization process that learns a parameter to balance two distinct reward signals. We demonstrate that our method enables quadrupedal robots to acquire highly agile behaviors—including crawling, climbing, leaping, and complex maneuvers such as jumping off vertical walls. Finally, we successfully deploy the learned policy on real hardware, validating its transferability to the real world.
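The abstract describes a bi-level optimization that learns a single parameter to balance a skill-discovery (exploration) reward against a task reward. The paper does not give the update rule here, so the following is a minimal sketch under assumed conventions: `lam` is a hypothetical mixing weight on the exploration reward, and the outer loop nudges `lam` up when task success stalls below a target and down once the task reward dominates. The function names and the success-rate-driven rule are illustrative, not the authors' method.

```python
import numpy as np

def mixed_reward(r_task, r_skill, lam):
    """Inner-loop reward: blend task and skill-discovery terms.

    lam in [0, 1] is the weight on the exploration (skill) reward;
    (1 - lam) weights the task reward.
    """
    return (1.0 - lam) * r_task + lam * r_skill

def update_lambda(lam, task_success_rate, target=0.5, lr=0.1):
    """Outer-loop step (hypothetical rule): increase the exploration
    weight when task success lags the target, decrease it otherwise."""
    lam = lam + lr * (target - task_success_rate)
    return float(np.clip(lam, 0.0, 1.0))

# Toy outer loop: as measured success rises, the exploration weight decays.
lam = 0.5
for success in [0.1, 0.3, 0.6, 0.8]:
    lam = update_lambda(lam, success)
```

This captures only the schedule-free spirit of the idea: exploration pressure is regulated by a learned scalar rather than a hand-designed curriculum.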
Supplementary Material: zip
Spotlight: zip
Submission Number: 241