Semantically-Driven Object Search Using Partially Observed 3D Scene Graphs

Published: 07 Nov 2023, Last Modified: 04 Dec 2023 (FMDM@NeurIPS 2023)
Keywords: Planning under uncertainty, object search, 3D scene graphs, language model, POMDP
TL;DR: We introduce a framework that leverages 3D scene graphs to formulate a new POMDP describing object search problems, together with an online sampling-based planner that solves these POMDPs using language model embeddings for belief construction.
Abstract: Object search is a fundamental task for service robots aiding humans in their daily lives. For example, a robot must locate a cup before pouring coffee, or locate a sponge before cleaning up a spill. Robots performing object search across many different, potentially unseen environments must therefore reason about uncertainty in both environment layout and object location. In this work, we frame object search as a Partially Observable Markov Decision Process (POMDP) and propose a generalizable planner that combines the structured representations afforded by 3D scene graphs with the semantic knowledge of language models. Specifically, we introduce (i) 3DSG-POMDPs, which are POMDPs defined over 3D scene graphs that reduce the dimensionality of object search, and (ii) PROPHE-C, a sampling-based planner for solving 3DSG-POMDPs. We demonstrate the efficacy of PROPHE-C in a partially observable household environment, revealing that additional online inference leads to more efficient and exploratory search plans, compared to solely relying on language models for decision-making.
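To make the abstract's idea concrete, the following is a minimal, illustrative sketch (not the authors' implementation): a toy scene graph of rooms and objects, a belief over the target object's location seeded by hand-coded similarity scores standing in for language-model embeddings, and a Monte-Carlo search over visit orders. All names, numbers, and the cost model here are assumptions for illustration only.

```python
import random

# Toy 3D scene graph: room nodes with contained-object edges
# (a real 3DSG would be hierarchical and partially observed).
SCENE_GRAPH = {
    "kitchen":  ["mug", "sponge"],
    "office":   ["laptop"],
    "bathroom": ["towel"],
}

# Stand-in for language-model embedding similarity between the
# query object ("mug") and each room; the paper uses real LM
# embeddings, these numbers are illustrative assumptions.
ROOM_PRIOR = {"kitchen": 0.8, "office": 0.15, "bathroom": 0.05}


def initial_belief(prior):
    """Normalize similarity scores into a belief distribution."""
    z = sum(prior.values())
    return {room: p / z for room, p in prior.items()}


def sample_plan(belief, graph, n_samples=100, rng=None):
    """Sampling-based planning sketch: draw random room orderings
    and keep the one minimizing the expected number of rooms
    visited before finding the target."""
    rng = rng or random.Random(0)
    rooms = list(graph)
    best_order, best_cost = None, float("inf")
    for _ in range(n_samples):
        order = rng.sample(rooms, len(rooms))
        # Expected visits = sum of P(target in room) * visit position
        cost = sum(belief[r] * (i + 1) for i, r in enumerate(order))
        if cost < best_cost:
            best_order, best_cost = order, cost
    return best_order


belief = initial_belief(ROOM_PRIOR)
plan = sample_plan(belief, SCENE_GRAPH)
print(plan)  # rooms ordered so the most likely location is searched first
```

Planning over rooms rather than raw poses is what makes this tractable: the scene graph collapses the search space to a handful of semantic nodes, and the embedding-derived prior biases sampling toward plausible locations.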
Submission Number: 57