H-GAP: Humanoid Control with a Generalist Planner

Published: 07 Nov 2023, Last Modified: 17 Nov 2023
Venue: FMDM@NeurIPS2023
Keywords: Generative Modeling, Humanoid Control, Model Predictive Control, Model-based Reinforcement Learning, Offline Reinforcement Learning
TL;DR: We present Humanoid Generalist Autoencoding Planner (H-GAP), a general-purpose trajectory generative model capable of adeptly handling various downstream control tasks with Model Predictive Control.
Abstract: Humanoid control is an important research challenge, offering avenues for integration into human-centric infrastructures and enabling physics-driven humanoid animation. The main difficulties in this field stem from optimizing in high-dimensional action spaces and from the instability introduced by the bipedal morphology of humanoids. However, the extensive collection of human motion-capture data and the derived datasets of humanoid trajectories, such as MoCapAct, pave the way to tackle these challenges. In this context, we present Humanoid Generalist Autoencoding Planner (H-GAP), a state-action trajectory generative model trained on humanoid trajectories derived from human motion-capture data, capable of adeptly handling downstream control tasks with Model Predictive Control (MPC). For a humanoid with 56 degrees of freedom, we empirically demonstrate that H-GAP learns to represent and generate a wide range of motor behaviours. Further, without any learning from online interactions, it can flexibly transfer these behaviours to solve novel downstream control tasks via planning. Notably, H-GAP outperforms established MPC baselines that have access to the ground-truth model, and is superior or comparable to offline RL methods trained for individual tasks. Finally, we conduct a series of empirical studies on the scaling properties of H-GAP, showing the potential for performance gains via additional data but not additional compute.
Submission Number: 24
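
Illustrative sketch: the abstract describes planning with MPC on top of a trajectory generative model. The toy example below is not the H-GAP implementation; it only illustrates the general pattern of sampling candidate state-action trajectories from a model, scoring them with a downstream task objective, and executing the first action of the best candidate before replanning. The function names (`sample_trajectories`, `plan`, `run_mpc`, `reach_target`), the 1-D integrator dynamics, and all hyperparameters are assumptions made for illustration.

```python
import numpy as np

def sample_trajectories(state, n_samples, horizon, rng):
    """Hypothetical stand-in for a learned trajectory generative model.

    Returns candidate (states, actions) rollouts that start at `state`.
    Here the "model" is just a toy 1-D integrator driven by random actions.
    """
    actions = rng.uniform(-1.0, 1.0, size=(n_samples, horizon))
    # Toy dynamics: each action shifts the state by 0.1 * action.
    states = state + 0.1 * np.cumsum(actions, axis=1)
    return states, actions

def plan(state, objective, rng, n_samples=256, horizon=8):
    """Return the first action of the sampled trajectory with the best score."""
    states, actions = sample_trajectories(state, n_samples, horizon, rng)
    scores = objective(states)            # shape: (n_samples,)
    best = int(np.argmax(scores))
    return float(actions[best, 0])

def run_mpc(initial_state, objective, steps=50, seed=0):
    """Receding-horizon loop: replan every step, execute only the first action."""
    rng = np.random.default_rng(seed)
    state = float(initial_state)
    for _ in range(steps):
        action = plan(state, objective, rng)
        state = state + 0.1 * action      # toy environment step (assumed dynamics)
    return state

def reach_target(states, target=1.0):
    """Hypothetical downstream objective: keep planned states close to `target`."""
    return -np.mean(np.abs(states - target), axis=1)

final_state = run_mpc(0.0, reach_target)
print(f"final state: {final_state:.3f}")
```

Only the objective encodes the task in this sketch, so swapping `reach_target` for a different scoring function changes the behaviour without retraining the sampler. This mirrors, in a very simplified form, the idea of reusing one generative model across downstream tasks via planning.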