Student First Author: yes
Keywords: Imitation Learning, Dexterous Manipulation
TL;DR: We contribute a structured approach for sample-efficient learning of dexterous manipulation skills from demonstrations on a physical 23-DoF hand-arm system by leveraging Geometric Fabrics, a recent theoretical framework for robot motion generation.
Abstract: Learning dexterous manipulation policies for multi-fingered robots has been a long-standing challenge in robotics. Existing methods either limit themselves to highly constrained problems and smaller models to achieve extreme sample efficiency, or sacrifice sample efficiency to gain the capacity to solve more complex tasks with deep neural networks. In this work, we develop a structured approach to sample-efficient learning of dexterous manipulation skills from demonstrations by leveraging recent advances in robot motion generation and control. Specifically, our policy structure is induced by Geometric Fabrics, a recent framework that generalizes classical mechanical systems to allow for flexible design of expressive robot motions. To avoid the cumbersome manual design required by existing motion generators, we introduce Neural Geometric Fabrics (NGFs), a framework that learns Geometric Fabric-based policies from data. NGF policies are provably stable and capable of encoding speed-invariant geometries of complex motions in multiple task spaces simultaneously. We demonstrate that NGFs can learn to perform a variety of dexterous manipulation tasks on a physical 23-DoF hand-arm robotic platform purely from demonstrations. Results from comprehensive comparative and ablative experiments show that the structure and action spaces of NGFs help learn acceleration-based policies that consistently outperform state-of-the-art baselines such as Riemannian Motion Policies (RMPs), as well as other commonly used networks, including feed-forward and recurrent neural networks. More importantly, we demonstrate that NGFs do not rely on the commonly used, expertly designed operational-space controllers, a step towards efficiently learning safe, stable, and high-dimensional controllers.
Supplementary Material: zip