Keywords: Real-to-Sim; Digital Twin; Sim-to-Real Transfer
TL;DR: An end-to-end framework for automatically generating fully interactive simulated scenes ("digital cousins") from a single real-world RGB image, enabling training of robot policies that can be deployed zero-shot in the real world.
Abstract: Training robot policies in the real world can be unsafe, costly, and difficult to scale. Simulation serves as an inexpensive and potentially limitless source of training data, but suffers from semantic and physical disparities between simulated and real-world environments. These discrepancies can be minimized by training in *digital twins*, which serve as virtual replicas of a real scene but are expensive to generate and do not facilitate cross-domain generalization. To address these limitations, we propose the concept of *digital cousins*: virtual assets or scenes that, unlike *digital twins*, do not explicitly model their real-world counterparts but still exhibit similar geometric and semantic affordances. As a result, *digital cousins* simultaneously reduce the cost of generating an analogous virtual environment and facilitate better generalization across domains by providing a distribution of similar training scenes. Leveraging digital cousins, we introduce a novel method for the **A**utomatic **C**reation of **D**igital **C**ousins (**ACDC**), and propose a fully automated real-to-sim-to-real pipeline for generating fully interactive scenes and training robot policies that can be deployed zero-shot in the original scene. We find that **ACDC** can produce digital cousin scenes that preserve geometric and semantic affordances, and can be used to train policies that outperform policies trained on digital twins, achieving 90\% vs. 25\% under zero-shot sim-to-real transfer. Additional details are available at https://digital-cousins.github.io/.
Supplementary Material: zip
Submission Number: 104