Fusing vision and contact-rich physics improves object reconstruction under occlusion

Published: 21 Jun 2025, Last Modified: 21 Jun 2025 · SWOMO RSS25 Poster · CC BY 4.0
Keywords: geometry reconstruction, contact-rich physics, dynamics model learning, robotic manipulation, system identification
TL;DR: Vysics is a vision-and-physics framework for a robot to build an expressive geometry and dynamics model of a single rigid body, using a seconds-long RGBD video and the robot's proprioception.
Abstract: We introduce Vysics, a vision-and-physics framework for building an expressive geometry and dynamics model of a rigid body, using a seconds-long RGBD video and robot proprioception. While the computer vision community has built powerful visual 3D perception algorithms, cluttered environments can limit visibility of objects of interest. However, the observed motion of partially occluded objects can imply that physical interactions took place, such as contacts with the robot or the environment. Inferred contacts supplement the visible geometry with “physible” geometry, which best explains the observed object motion through physics. Vysics uses a vision-based tracking and reconstruction method, BundleSDF, to estimate the trajectory and visible geometry from an RGBD video, and an odometry-based model learning method, the Physics Learning Library (PLL), to infer the “physible” geometry from the trajectory through implicit contact dynamics optimization. The visible and “physible” geometries jointly optimize the object's signed distance function (SDF). Vysics requires no pretraining and no tactile or force sensors. Compared to vision-only baselines, Vysics yields object models with higher geometric accuracy and better dynamics prediction in experiments where the object interacts with the robot and environment under heavy occlusion. Project page: https://vysics-vision-and-physics.github.io/
Submission Number: 7