Interactive Humanoid: Online Full Body Human Motion Reaction Synthesis With Social Affordance Forecasting and Canonicalization
Keywords: Reaction Synthesis, SE(3)-Equivariant Neural Networks, Local Frame Learning
TL;DR: We construct two datasets and propose a unified method for the online full-body motion reaction synthesis task.
Abstract: We focus on the human-humanoid interaction problem, optionally involving an object. We propose a new task named online full-body motion reaction synthesis, which generates humanoid reactions based on the human actor's motions. Previous work focuses only on human-human interaction without objects and generates body reactions without hands. Moreover, it does not treat the task as an online setting, in which the reactor can only observe current information and cannot perceive the actor's future actions. To support the task of online full-body motion reaction synthesis, we construct two datasets, named HHI and CoChair, and propose a unified method. Specifically, we encode the motions of the human actor and the object from an interaction-centric view through a social affordance representation. We then leverage a social affordance forecasting scheme to enable the reactor to predict based on an imagined future. We also use SE(3)-equivariant neural networks to learn a local frame for canonicalizing the social affordance. Experiments demonstrate that our approach effectively generates high-quality reactions on HHI and CoChair. Furthermore, we validate our method on the existing human interaction datasets Interhuman and Chi3D, running in real time at 25 fps.
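For intuition about the canonicalization step mentioned in the abstract, here is a minimal sketch, not the paper's implementation: it replaces the learned SE(3)-equivariant frame predictor with a simple PCA-based frame (the helper names `predict_local_frame` and `canonicalize` are hypothetical) and checks that the canonicalized representation is unchanged when a global rigid motion is applied to the scene.

```python
import numpy as np

# Minimal sketch (NOT the paper's implementation): it only illustrates what
# "canonicalizing the social affordance with a learned local frame" buys.
# The paper predicts the frame with an SE(3)-equivariant network; here a
# simple PCA-based frame stands in for that predictor, and the names
# predict_local_frame / canonicalize are hypothetical.


def predict_local_frame(points):
    """Return (R, t): a rotation and origin attached to the point set."""
    t = points.mean(axis=0)                      # frame origin = centroid
    centered = points - t
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    R = vt.T                                     # principal axes as columns
    # Resolve the sign ambiguity of each axis with the third moment so the
    # frame transforms consistently with the input.
    signs = np.sign(((centered @ R) ** 3).sum(axis=0))
    signs[signs == 0] = 1.0
    R = R * signs
    if np.linalg.det(R) < 0:                     # enforce a right-handed frame
        R[:, -1] *= -1
    return R, t


def canonicalize(points):
    """Express points in their own local frame: invariant to rigid motion."""
    R, t = predict_local_frame(points)
    return (points - t) @ R


# Sanity check: applying a global rigid motion to the scene does not change
# the canonicalized representation.
rng = np.random.default_rng(0)
pts = rng.normal(size=(22, 3))                   # e.g. 22 actor joints

A = rng.normal(size=(3, 3))                      # random rotation via QR
Q, _ = np.linalg.qr(A)
if np.linalg.det(Q) < 0:
    Q[:, -1] *= -1
t_g = rng.normal(size=3)                         # random translation

moved = pts @ Q.T + t_g
print(np.allclose(canonicalize(pts), canonicalize(moved), atol=1e-6))  # True
```

The invariance demonstrated here is what a learned local frame provides to the reaction generator: the reactor's inputs no longer depend on where the interaction happens in world coordinates.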
Supplementary Material: zip
Submission Number: 99