An Interactive Navigation Method with Effect-oriented Affordance

Xiaohan Wang, Yuehu Liu, Xinhang Song, Yuyi Liu, Sixian Zhang, Shuqiang Jiang

Published: 2024, Last Modified: 12 Dec 2024. CVPR 2024. CC BY-SA 4.0

Abstract: Visual navigation requires an agent to reach a target using continuous visual input. Most previous work assumes a static, ideal environment in which the target is always reachable without altering the environment. However, “messy” environments, where the agent may be blocked by obstacles, are more common and practical in daily life. Interactive Navigation (InterNav) was therefore introduced: the agent navigates to objects in more realistic “messy” environments by interacting with the objects in its way. Prior work on InterNav learns short-term interaction through extensive trials with reinforcement learning. However, interaction alone does not guarantee efficient navigation; planning obstacle interactions that yield shorter paths and consume less effort is also crucial. In this paper, we introduce an effect-oriented affordance map to enable long-term interactive navigation, extending the existing map-based navigation framework to dynamic environments. We train a set of affordance functions that predict the available interactions and the time cost of removing obstacles, and these predictions support an interactive modular system that addresses both interaction and long-term planning. Experiments on the ProcTHOR simulator demonstrate the capability of our affordance-driven system in long-term navigation in complex dynamic environments.
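
The trade-off described in the abstract, between detouring around an obstacle and paying a time cost to remove it, can be illustrated with a small sketch. This is not the authors' system: it assumes the affordance predictions have already been reduced to a per-cell removal cost on a 2D grid, and it uses plain Dijkstra search as a stand-in planner. The names `plan_with_affordance` and `removal_cost`, the grid symbols, and the unit step cost are all hypothetical choices made for illustration.

```python
import heapq

INF = float("inf")

def plan_with_affordance(grid, removal_cost, start, goal):
    """Dijkstra search on a 2D grid with removable obstacles.

    grid[r][c] is '.' (free), '#' (fixed wall) or 'o' (removable obstacle).
    removal_cost maps an obstacle cell (r, c) to an assumed predicted
    time cost of clearing it (a stand-in for an affordance function).
    Returns (total_cost, path), or (inf, []) if the goal is unreachable.
    """
    rows, cols = len(grid), len(grid[0])
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == goal:
            break
        if d > dist.get(cell, INF):
            continue  # stale heap entry
        r, c = cell
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            kind = grid[nr][nc]
            if kind == "#":
                continue  # fixed walls offer no interaction
            step = 1.0  # assumed unit cost of moving one cell
            if kind == "o":
                step += removal_cost[(nr, nc)]  # pay to clear the obstacle
            nd = d + step
            if nd < dist.get((nr, nc), INF):
                dist[(nr, nc)] = nd
                prev[(nr, nc)] = cell
                heapq.heappush(heap, (nd, (nr, nc)))
    if goal not in dist:
        return INF, []
    # Walk back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[goal], path

if __name__ == "__main__":
    grid = [".o.",
            "..."]
    # Cheap obstacle: the planner goes through it.
    print(plan_with_affordance(grid, {(0, 1): 1.0}, (0, 0), (0, 2)))
    # Expensive obstacle: the planner detours through the second row.
    print(plan_with_affordance(grid, {(0, 1): 5.0}, (0, 0), (0, 2)))
```

Folding the removal cost into the edge weight lets a single search compare interaction against detour, which is the kind of long-term reasoning the abstract contrasts with purely short-term interaction policies.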