Keywords: Action-Conditioned Scene Graph, Foundation Models for Robotics, Scene Exploration, Robotic Manipulation
Abstract: We introduce the novel task of interactive scene exploration, wherein
robots autonomously explore environments and produce an action-conditioned
scene graph (ACSG) that captures the structure of the underlying environment.
The ACSG accounts for both low-level information (geometry and semantics)
and high-level information (action-conditioned relationships between different
entities) in the scene. To this end, we present the Robotic Exploration (RoboEXP)
system, which incorporates a Large Multimodal Model (LMM) and an explicit
memory design to enhance its capabilities. The robot reasons about
what to explore and how to explore it, accumulating new information through
interaction and incrementally constructing the ACSG. Leveraging the
constructed ACSG, we illustrate the effectiveness and efficiency of our RoboEXP
system in facilitating a wide range of real-world manipulation tasks involving
rigid, articulated, nested, and deformable objects. Project Page:
https://jianghanxiao.github.io/roboexp-web/
Submission Number: 40