Generating Multi-hierarchy Scene Graphs for Human-instructed Manipulation Tasks in Open-world Settings

Published: 16 Apr 2024, Last Modified: 02 May 2024
Venue: MoMa WS 2024 Poster
License: CC BY 4.0
Keywords: Scene Graphs, Robot Manipulation, Foundation Models
TL;DR: Proposing a method for generating scene graphs for open-world scenarios
Abstract: Generating viable multi-step plans in robotics requires a scene representation that is both open-set and structured so that it can be updated locally when the scene changes. We propose a method for generating multi-hierarchical scene graphs in a zero-shot manner using foundation models, which can support downstream planning tasks. We demonstrate that our method yields superior results compared to previous works in both open-world object detection and relation extraction, even without any priors. Moreover, we illustrate how the multi-hierarchical nature of the scene graph helps the planner devise feasible plans for tasks that require reasoning over spatial arrangements and object category abstractions.
Submission Number: 16
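
To make the abstract's notion of a multi-hierarchical, locally updatable scene graph more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes two hierarchies, a spatial one (object on/in a support) and a category one (instance to increasingly abstract category). The class `SceneGraph`, its methods, and all node names such as `mug_1` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    # Spatial hierarchy (assumed): child -> (relation, parent), e.g. mug_1 -> ("on", table_1)
    spatial_parent: dict = field(default_factory=dict)
    # Category hierarchy (assumed): node -> more abstract category, e.g. mug -> container
    category_parent: dict = field(default_factory=dict)

    def place(self, obj, relation, support):
        """Local update: only the spatial edge of `obj` is rewritten."""
        self.spatial_parent[obj] = (relation, support)

    def categories(self, node):
        """Return the chain of increasingly abstract categories above `node`."""
        chain = []
        while node in self.category_parent:
            node = self.category_parent[node]
            chain.append(node)
        return chain

    def location_chain(self, obj):
        """Return the (relation, support) pairs from `obj` up to the spatial root."""
        chain = []
        while obj in self.spatial_parent:
            relation, obj = self.spatial_parent[obj]
            chain.append((relation, obj))
        return chain

    def find(self, category):
        """Return placed nodes whose category chain contains `category`, sorted by name."""
        return sorted(o for o in self.spatial_parent if category in self.categories(o))


# Hypothetical example scene.
graph = SceneGraph()
graph.category_parent.update(
    {"mug_1": "mug", "mug": "container", "bowl_1": "bowl", "bowl": "container"}
)
graph.place("table_1", "in", "kitchen")
graph.place("shelf_1", "in", "kitchen")
graph.place("mug_1", "on", "table_1")
graph.place("bowl_1", "on", "shelf_1")

containers = graph.find("container")   # query via the category abstraction
before = graph.location_chain("mug_1")
graph.place("mug_1", "on", "shelf_1")  # the scene changes: one edge is updated
after = graph.location_chain("mug_1")
```

In this sketch, a planner could resolve an instruction that names an abstract category (for example, any container) through `find`, and reason about where an object sits through `location_chain`. Moving the mug rewrites a single edge and leaves the rest of the graph untouched.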