A Simple Approach for Visual Rearrangement: 3D Mapping and Semantic Search

Brandon Trabucco; Gunnar A Sigurdsson; Robinson Piramuthu; Gaurav S. Sukhatme; Ruslan Salakhutdinov

Published: 23 Jun 2022, Last Modified: 03 Nov 2024 | L-DOD 2022 Poster | Readers: Everyone

Keywords: Embodied AI, Deep Learning, Rearrangement

Abstract: Physically rearranging objects is an important capability for embodied agents. Visual room rearrangement evaluates an agent's ability to rearrange objects in a room to a desired goal based solely on visual input. We propose a simple yet effective method for this problem: (1) search for and map which objects need to be rearranged, and (2) rearrange each object until the task is complete. Our approach consists of an off-the-shelf semantic segmentation model, a voxel-based semantic map, and a semantic search policy that efficiently finds objects that need to be rearranged. On the AI2-THOR Rearrangement Challenge, our method improves the rate of correct rearrangement from 0.53% for current state-of-the-art end-to-end reinforcement learning methods to 15.11%, using only 2.7% as many samples from the environment.
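As a rough illustration of the mapping idea in the abstract, the sketch below builds a toy voxel-based semantic map from labeled 3D points and compares a goal map against a current map to decide which object classes look displaced. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `voxelize` and `objects_to_rearrange`, the voxel size, and the input format (points already labeled by a segmentation model and projected into world coordinates) are all hypothetical.

```python
from collections import defaultdict

VOXEL_SIZE = 0.25  # metres per voxel edge; an illustrative value, not from the paper


def voxelize(points, voxel_size=VOXEL_SIZE):
    """Build a toy semantic map: class label -> set of occupied voxel indices.

    `points` is an iterable of (x, y, z, label) tuples, standing in for
    depth pixels labeled by a segmentation model and projected into
    world coordinates (an assumed input format).
    """
    semantic_map = defaultdict(set)
    for x, y, z, label in points:
        # Floor division maps each coordinate to its voxel index.
        voxel = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        semantic_map[label].add(voxel)
    return dict(semantic_map)


def objects_to_rearrange(goal_map, current_map):
    """Return sorted labels whose occupied voxels differ between the two maps."""
    labels = set(goal_map) | set(current_map)
    return sorted(
        label
        for label in labels
        if goal_map.get(label, set()) != current_map.get(label, set())
    )


# Hypothetical observations: the mug stays put, the book has moved.
goal_points = [(0.1, 0.0, 0.1, "mug"), (1.0, 0.0, 2.0, "book")]
current_points = [(0.1, 0.0, 0.1, "mug"), (2.1, 0.0, 0.3, "book")]

goal_map = voxelize(goal_points)
current_map = voxelize(current_points)
moved = objects_to_rearrange(goal_map, current_map)
print(moved)
```

Comparing exact voxel sets is the simplest possible disagreement test; a practical system would need to tolerate segmentation noise and partial views, which this sketch deliberately ignores.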

Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/a-simple-approach-for-visual-rearrangement-3d/code)
