CooHOI: Learning Cooperative Human-Object Interaction with Manipulated Object Dynamics

Jiawei Gao; Ziqin Wang; Zeqi Xiao; Jingbo Wang; Tai Wang; Jinkun Cao; Xiaolin Hu; Si Liu; Jifeng Dai; Jiangmiao Pang

Published: 31 Oct 2024, Last Modified: 08 Nov 2024. Venue: CoRL 2024 Workshop WCBM. License: CC BY 4.0

Keywords: Humanoid Robots, Human-Object Interactions, Multi-Agent Cooperation

TL;DR: A framework for learning multi-humanoid cooperative carrying tasks.

Abstract: Enabling humanoid robots to clean rooms has long been a dream pursued within humanoid research communities. However, many real-world tasks, such as moving large and heavy furniture, require multi-humanoid collaboration. Given the scarcity of motion capture data on multi-humanoid collaboration and the efficiency challenges associated with multi-agent learning and control, these tasks cannot be straightforwardly addressed using training paradigms designed for single-agent scenarios. In this paper, we introduce **Coo**perative **H**uman-**O**bject **I**nteraction (**CooHOI**), a framework designed to tackle the multi-humanoid object transportation problem through a two-phase learning paradigm: individual skill learning and subsequent policy transfer. First, a single humanoid character learns to interact with objects through imitation learning from human motion priors. Then, the humanoid learns to collaborate with others by considering the shared dynamics of the manipulated object using centralized training and decentralized execution (CTDE) multi-agent RL algorithms. When one agent interacts with the object and changes its dynamics, the other agents learn to respond appropriately, thereby achieving implicit communication and coordination between teammates. Unlike previous approaches that relied on tracking-based methods for multi-humanoid HOI, CooHOI is inherently efficient, does not depend on motion capture data of multi-humanoid interactions, and can be seamlessly extended to include more participants and a wide range of object types.

Submission Number: 8
