H-InDex: Visual Reinforcement Learning with Hand-Informed Representations for Dexterous Manipulation

Published: 21 Sept 2023, Last Modified: 02 Nov 2023, NeurIPS 2023 poster
Keywords: Visual Reinforcement Learning, Representation Learning, Dexterous Manipulation
TL;DR: We propose H-InDex, a hand-informed visual representation for dexterous manipulation with reinforcement learning.
Abstract: Human hands possess remarkable dexterity and have long served as a source of inspiration for robotic manipulation. In this work, we propose a human $\textbf{H}$and-$\textbf{In}$formed visual representation learning framework to solve difficult $\textbf{Dex}$terous manipulation tasks ($\textbf{H-InDex}$) with reinforcement learning. Our framework consists of three stages: $\textit{(i)}$ pre-training representations with 3D human hand pose estimation, $\textit{(ii)}$ offline adapting representations with self-supervised keypoint detection, and $\textit{(iii)}$ reinforcement learning with exponential moving average BatchNorm. The last two stages modify only $0.36\%$ of the pre-trained representation's parameters in total, ensuring that the knowledge from pre-training is preserved to the fullest extent. We empirically study $\textbf{12}$ challenging dexterous manipulation tasks and find that $\textbf{H-InDex}$ substantially surpasses strong baseline methods and recent visual foundation models for motor control. Code and videos are available at https://yanjieze.com/H-InDex.
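For stage $\textit{(iii)}$, the abstract describes updating the pre-trained encoder with exponential moving average BatchNorm during reinforcement learning. Below is a minimal PyTorch sketch of that general idea, assuming the convolutional weights stay frozen while only the BatchNorm running statistics drift toward the RL observation distribution; the function name and momentum value are illustrative and may differ from the released H-InDex code.

```python
import torch
import torch.nn as nn


def ema_update_batchnorm(encoder: nn.Module,
                         obs_batch: torch.Tensor,
                         momentum: float = 0.01) -> None:
    """Update only BatchNorm running statistics with an EMA; all weights stay frozen.

    Sketch of the exponential-moving-average BatchNorm idea, not the authors' code.
    """
    encoder.eval()  # freeze behavior of all layers by default
    for m in encoder.modules():
        if isinstance(m, nn.BatchNorm2d):
            m.momentum = momentum  # small EMA coefficient -> slow statistics drift
            m.train()              # enable running-stat updates for BN layers only
    with torch.no_grad():          # no gradients: parameters are untouched,
        encoder(obs_batch)         # only running_mean / running_var are updated


# Usage (hypothetical): call once per batch of RL observations so the frozen
# encoder's normalization statistics track the current visual distribution.
# encoder = load_pretrained_hand_pose_backbone()
# ema_update_batchnorm(encoder, obs_batch)
```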
Supplementary Material: pdf
Submission Number: 4347