RHINO: Learning Real-Time Humanoid-Human-Object Interaction from Human Demonstrations

Published: 01 Sept 2025, Last Modified: 27 Sept 2025, HRSIC 2025 Oral, CC BY 4.0
Keywords: Human-Robot Interaction, Humanoid Robots, Human-object-humanoid Interaction
TL;DR: We present RHINO, the first real-time humanoid interaction framework that learns from human demonstrations to enable dynamic task-switching and instant response to human instructions.
Abstract: Humanoid robots have shown success in locomotion and manipulation. Beyond these basic abilities, humanoids must also quickly understand human instructions and react to human interaction signals to become valuable assistants in daily life. Unfortunately, most existing works focus only on multi-stage interactions, treating each task separately and neglecting real-time feedback. In this work, we aim to empower humanoid robots with real-time reaction abilities across various tasks, allowing humans to interrupt robots at any time and making robots respond to humans immediately. To support such abilities, we propose a general humanoid-human-object interaction framework named RHINO, i.e., Real-time Humanoid-human Interaction and Object manipulation. RHINO provides a unified view of human intent prediction, interactive motion, instruction-based manipulation, and safety concerns, over multiple human signal modalities. RHINO is a hierarchical learning framework that enables humanoids to acquire interaction skills from human-human-object demonstration and teleoperation data, while generalizing across diverse human appearances. Specifically, it learns 1) object manipulation skills from teleoperation datasets, and 2) reactive motion skills and 3) human intent from human-object-human interaction datasets. It decouples the interaction process into two levels: 1) a high-level planner that infers human intents from real-time human behaviors; and 2) a low-level controller that achieves expressive interaction behaviors and object manipulation skills based on the predicted intents. We evaluate our framework with human studies and quantitative experiments on a real humanoid robot and demonstrate its effectiveness and robustness in various scenarios. We believe RHINO helps bring robots, especially humanoids, closer to our daily lives.
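The two-level decoupling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the learned planner is replaced here by a simple lookup table, the low-level controller by a string-returning stub, and all names (`Observation`, `plan_intent`, `control_step`, `run_loop`, the gesture and intent labels) are hypothetical. The only point it shows is that re-inferring the intent on every frame lets a human interrupt the current skill at any time.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

IDLE = "idle"  # hypothetical default intent when no human signal is recognized


@dataclass
class Observation:
    """Hypothetical per-frame human signal, reduced to a gesture label."""
    gesture: str


def plan_intent(obs: Observation, gesture_to_intent: Dict[str, str], current: str) -> str:
    """High-level planner stand-in: map the latest human signal to an intent.

    A learned model would go here; this sketch uses a lookup table and keeps
    the current intent when the signal is not recognized.
    """
    return gesture_to_intent.get(obs.gesture, current)


def control_step(intent: str, obs: Observation) -> str:
    """Low-level controller stand-in: return a command name for the intent.

    A real controller would output joint targets; a string is enough to show
    which skill is active on each frame.
    """
    return f"execute:{intent}"


def run_loop(stream: Iterable[Observation], gesture_to_intent: Dict[str, str]) -> List[str]:
    """Re-plan on every frame so a new human signal switches the skill at once."""
    intent = IDLE
    commands: List[str] = []
    for obs in stream:
        # The planner runs every frame, not only when a skill finishes,
        # so an interruption takes effect on the very next control step.
        intent = plan_intent(obs, gesture_to_intent, intent)
        commands.append(control_step(intent, obs))
    return commands
```

In this sketch, feeding a stream where a "point at can" gesture is followed by an "extend hand" gesture would switch the active skill mid-task without waiting for the first skill to complete, which is the behavior the abstract refers to as interruption.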
Supplementary Material: zip
Submission Number: 11