Agent with the Big Picture: Perceiving Surroundings for Interactive Instruction Following

29 Sept 2023, OpenReview Archive Direct Upload, Readers: Everyone
Abstract: We address the interactive instruction following task, which requires an agent to navigate through an environment, interact with objects, and complete long-horizon tasks by following natural language instructions with egocentric vision. To achieve a goal in this task, the agent must infer a sequence of actions and object interactions. When performing actions, a small field of view often limits the agent’s understanding of the environment, leading to poor performance. Here, we propose to exploit surrounding views by gathering additional observations from navigable directions to enlarge the agent’s field of view.