- Abstract: We present a method for policy learning to navigate indoor environments. We adopt a hierarchical policy approach, where two agents are trained to work in cohesion with one another to perform a complex navigation task. A Planner agent operates at a higher level and proposes sub-goals for an Executor agent. The Executor reports an embedding summary back to the Planner as additional side information at the end of its series of operations for the Planner's next sub-goal proposal. The end goal is generated by the environment and exposed to the Planner which then decides which set of sub-goals to propose to the Executor. We show that this Planner-Executor setup drastically increases the sample efficiency of our method over traditional single agent approaches, effectively mitigating the difficulty accompanying long series of actions with a sparse reward signal. On the challenging Habitat environment which requires navigating various realistic indoor environments, we demonstrate that our approach offers a significant improvement over prior work for navigation.
- Keywords: Hierarchical Reinforcement Learning, Embodied Learning, Navigation
- TL;DR: We present a hierarchical learning framework for navigation within an embodied learning setting