Student First Author: yes
Keywords: Landmark-Based Navigation, Incremental Topological Memory, Visual Navigation
TL;DR: We propose a landmark-based topological semantic graph memory (TSGM) for image goal navigation. It significantly outperforms baselines on success rate and on path efficiency (SPL). The method is also demonstrated in a real-world environment on a Jackal robot.
Abstract: A novel framework is proposed to incrementally collect a landmark-based graph memory and use the collected memory for image goal navigation. Given a target image to search for, an embodied robot utilizes semantic memory to find the target in an unknown environment. In this paper, we present a topological semantic graph memory (TSGM), which consists of (1) a graph builder that takes the observed RGB-D image and constructs a topological semantic graph, (2) a cross graph mixer module that takes the collected nodes and extracts contextual information, and (3) a memory decoder that takes the contextual memory as input and selects an action toward the target. On the image goal navigation task, TSGM significantly outperforms competitive baselines by +5.0-9.0% in success rate and +7.0-23.5% in SPL, indicating that TSGM finds more efficient paths. Additionally, we demonstrate our method on a mobile robot in real-world image goal scenarios.
Supplementary Material: zip