Keywords: Robot Learning: Imitation Learning, Robot Learning: Foundation Models, Grasping & Manipulation
TL;DR: DexWild introduces a scalable, human hand data collection system and co-training framework that enables dexterous robot policies to generalize in the open world.
Abstract: Large-scale, diverse robot datasets have emerged as a promising path toward enabling dexterous manipulation policies to generalize to novel environments, but acquiring such datasets presents many challenges. While teleoperation provides high-fidelity datasets, its high cost limits its scalability. Instead, what if people could use their own hands, just as they do in everyday life, to collect data? In DexWild, a diverse team of data collectors uses their hands to collect hours of interactions across a multitude of environments and objects. To record this data, we create DexWild-System, a low-cost, mobile, and easy-to-use device. The DexWild learning framework co-trains on both human and robot demonstrations, leading to improved performance compared to training on each dataset individually. This combination results in robust robot policies capable of generalizing to novel environments, tasks, and embodiments with minimal additional robot-specific data. Experimental results demonstrate that DexWild significantly improves performance, achieving a 68.5% success rate in unseen environments, nearly four times higher than policies trained with robot data only, and offering 5.8× better cross-embodiment generalization.
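To make the co-training idea concrete, below is a minimal sketch of how mixed human–robot batches could be formed, assuming a PyTorch setup with two map-style demonstration datasets and a fixed sampling ratio. The dataset inputs, the `robot_fraction` knob, and the weighting scheme are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch only: form training batches that mix human and robot
# demonstrations at a fixed expected ratio, regardless of dataset sizes.
import torch
from torch.utils.data import ConcatDataset, DataLoader, WeightedRandomSampler


def make_cotrain_loader(human_ds, robot_ds, batch_size=64, robot_fraction=0.5):
    """Return a DataLoader whose batches mix human and robot demos.

    human_ds, robot_ds: map-style datasets of (observation, action) pairs.
    robot_fraction: assumed knob for the expected share of robot samples per batch.
    """
    combined = ConcatDataset([human_ds, robot_ds])

    # Weight each sample so robot data contributes `robot_fraction` of a batch
    # in expectation, even when far more human data has been collected.
    n_h, n_r = len(human_ds), len(robot_ds)
    weights = torch.cat([
        torch.full((n_h,), (1.0 - robot_fraction) / n_h),
        torch.full((n_r,), robot_fraction / n_r),
    ])

    sampler = WeightedRandomSampler(weights, num_samples=len(combined), replacement=True)
    return DataLoader(combined, batch_size=batch_size, sampler=sampler)
```

Weighting by dataset size rather than simply concatenating keeps a small robot dataset from being drowned out by the much larger in-the-wild human dataset during co-training.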
Submission Number: 2