Unsupervised skill discovery aims to learn diverse skills without extrinsic rewards, so that they can serve as priors for downstream tasks. Existing methods focus on empowerment or entropy maximization, but often produce static or non-discriminable skills. Our method, Hilbert Unsupervised Skill Discovery (HUSD), instead combines $f$-divergence with Integral Probability Metrics to promote behavioral diversity and disentanglement. HUSD maximizes the Maximum Mean Discrepancy (MMD) between the joint distribution of skills and states and the product of their marginals in a Reproducing Kernel Hilbert Space (RKHS), leading to better exploration and skill separability. Our results on Unsupervised RL Benchmarks show that HUSD outperforms previous exploration algorithms on state-based tasks.
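As a rough illustration of the quantity described above, the sketch below estimates the squared MMD between the empirical joint distribution of (skill, state) pairs and the product of their empirical marginals. With a product kernel, this is the biased Hilbert-Schmidt Independence Criterion estimator, $\frac{1}{n^2}\operatorname{tr}(KHLH)$. The function names, the Gaussian (RBF) kernels, and the bandwidth parameters are assumptions made for this example; the abstract does not specify the kernel or the estimator HUSD actually uses.

```python
import numpy as np


def rbf_gram(x, bandwidth=1.0):
    """Gaussian (RBF) Gram matrix for the rows of x (shape [n, d])."""
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative values
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd2_joint_vs_marginals(skills, states, bw_skill=1.0, bw_state=1.0):
    """Biased estimate of MMD^2 between p(z, s) and p(z) p(s).

    Assumes a product of RBF kernels on skills and states (an assumption of
    this sketch). Row i of `skills` must be paired with row i of `states`.
    """
    n = skills.shape[0]
    K = rbf_gram(skills, bw_skill)        # kernel matrix on skills
    L = rbf_gram(states, bw_state)        # kernel matrix on states
    H = np.eye(n) - np.ones((n, n)) / n   # centering matrix
    return float(np.trace(K @ H @ L @ H)) / n ** 2


# Hypothetical data: states that depend on the skill versus states that do not.
rng = np.random.default_rng(0)
z = rng.normal(size=(200, 2))
dependent_states = z + 0.1 * rng.normal(size=(200, 2))
independent_states = rng.normal(size=(200, 2))

dependent_score = mmd2_joint_vs_marginals(z, dependent_states)
independent_score = mmd2_joint_vs_marginals(z, independent_states)
```

Under these assumptions, `dependent_score` should be noticeably larger than `independent_score`, which matches the intuition that a larger discrepancy corresponds to states that are more informative about the skill that generated them.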
Track: Full track
Keywords: Unsupervised Skill Discovery, RKHS, MMD
Submission Number: 14