Semantic Policy Network for Zero-Shot Object Goal Visual Navigation

Published: 01 Jan 2023, Last Modified: 11 Apr 2025, IEEE Robotics Autom. Lett. 2023, CC BY-SA 4.0
Abstract: Zero-shot object goal visual navigation (ZSON) aims to enable robots to locate previously “unseen” objects from visual observations. The task is challenging because the robot must transfer a navigation policy learned on “seen” objects to “unseen” objects using only auxiliary semantic information, with no training samples for the new classes, a process known as zero-shot learning. To address this challenge, we propose the Semantic Policy Network (SPNet). SPNet consists of two modules that are deeply integrated with semantic embeddings: the Semantic Actor Policy (SAP) module and the Semantic Trajectory (ST) module. The SAP module generates weight biases for the actor network from semantic embeddings, producing a distinct navigation policy for each target class. The ST module records the robot's actions, visual features, and semantic embeddings at each step, and aggregates this information along both the spatial and temporal dimensions. We evaluate our approach with extensive experiments on the MP3D and HM3D datasets and in RoboTHOR. The results indicate that the proposed method outperforms other ZSON methods on both seen and unseen target classes.
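The idea behind the SAP module, generating a per-class offset to the actor weights from a semantic embedding, can be illustrated with a minimal sketch. The abstract does not specify the architecture, so everything below is an assumption made for illustration: the class name `SemanticActorSketch`, the single linear generator, the dimensions, the random initialization, and the toy embeddings are all hypothetical and are not the authors' implementation.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


class SemanticActorSketch:
    """Hypothetical actor whose weights are shifted by a semantic embedding."""

    def __init__(self, n_actions, feat_dim, embed_dim, seed=0):
        rng = random.Random(seed)
        self.n_actions = n_actions
        self.feat_dim = feat_dim
        # Shared actor weights, used for every target class.
        self.base = [[rng.uniform(-0.1, 0.1) for _ in range(feat_dim)]
                     for _ in range(n_actions)]
        # Assumed linear generator: embedding -> flattened weight offset.
        self.generator = [[rng.uniform(-0.1, 0.1) for _ in range(embed_dim)]
                          for _ in range(n_actions * feat_dim)]

    def weight_bias(self, embedding):
        """Return an (n_actions x feat_dim) offset for the actor weights."""
        flat = matvec(self.generator, embedding)
        return [flat[i * self.feat_dim:(i + 1) * self.feat_dim]
                for i in range(self.n_actions)]

    def policy(self, features, embedding):
        """Action probabilities under the class-specific actor weights."""
        bias = self.weight_bias(embedding)
        weights = [[b + d for b, d in zip(base_row, bias_row)]
                   for base_row, bias_row in zip(self.base, bias)]
        return softmax(matvec(weights, features))


actor = SemanticActorSketch(n_actions=4, feat_dim=6, embed_dim=3)
features = [0.5, -1.0, 0.25, 2.0, 0.0, 1.5]   # toy visual features
chair = [1.0, 0.0, 0.0]                       # toy embedding, class A
sofa = [0.0, 1.0, 0.0]                        # toy embedding, class B

p_chair = actor.policy(features, chair)
p_sofa = actor.policy(features, sofa)
```

With the same visual features, the two embeddings yield different action distributions, and an all-zero embedding falls back to the shared base policy. Because the offset depends only on the embedding, a class never seen in training still receives its own policy as long as an embedding is available for it, which is the property the zero-shot setting relies on.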