ARTIST: Articulated Real-To-Interactive-Sim Twin

Published: 18 Sept 2025, Last Modified: 18 Sept 2025, LSRW Poster, CC BY 4.0
Keywords: robotics, digital-twins, real-to-sim-to-real
TL;DR: A real-to-sim-to-real method for automatically constructing digital twins of real-life articulated objects.
Abstract: Real-to-Sim-to-Real frameworks enable data-efficient robot learning by leveraging realistic simulations, but existing approaches struggle to reconstruct articulated objects without manual interaction or dense multi-view observations. We present ARTIST (Articulated Real-To-Interactive-Sim Twin), a framework that automatically builds digital twins of articulated objects from a single monocular video. ARTIST first reconstructs and decomposes objects into parts by combining monocular 3D reconstruction with open-vocabulary segmentation, and then estimates articulations by adapting an actor–critic vision–language model to operate on reconstructed parts. On the ArtVIP dataset, ARTIST improves both 3D asset reconstruction and articulation estimation for previously unseen real-world objects. Finally, we demonstrate that ARTIST enables Real-to-Sim-to-Real transfer by replaying a single robot demonstration in simulation, highlighting its potential for scalable robot learning with minimal supervision.
Serve As Reviewer: ~Lennard_Schuenemann2
Submission Number: 23