Planning by Active Sensing

Published: 27 Oct 2023, Last Modified: 20 Nov 2023
Venue: Gaze Meets ML 2023 Oral
Submission Type: Full Paper
Keywords: model-based reinforcement learning, sequential decision-making, navigation, maze, gaze
TL;DR: By analyzing sequential gaze patterns during navigation, we find that humans plan their paths incrementally by using active sensing to build a partial map of the environment.
Abstract: Flexible behavior requires rapid planning, but planning requires a good internal model of the environment. Learning this model by trial and error is impractical when acting in complex environments. How do humans plan action sequences efficiently when there is uncertainty about model components? To address this question, we asked human participants to navigate complex mazes in virtual reality. We found that the paths taken to gather rewards were close to optimal even though participants had no prior knowledge of these environments. Based on the sequential eye movement patterns observed when participants mentally compute a path before navigating, we develop an algorithm that is capable of rapidly planning under uncertainty by active sensing, i.e., visually sampling information about the structure of the environment. New eye movements are chosen iteratively by following the gradient of a dynamic value map, which is updated based on the previous eye movement, until the planning process converges. In addition to bearing hallmarks of human navigational planning, the proposed algorithm is sample-efficient: the number of visual samples needed for planning scales linearly with the path length, regardless of the size of the state space.
Submission Number: 25
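
The iterative procedure described in the abstract (sample with the eyes, update a value map, follow its gradient, repeat until convergence) can be illustrated with a minimal sketch. This is not the authors' algorithm: the grid-world maze, the breadth-first distance-to-goal used as the value map, the optimistic assumption that unseen cells are open, and the names `value_map` and `plan_by_active_sensing` are all assumptions made for illustration.

```python
from collections import deque

STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connected grid moves


def value_map(believed_walls, size, goal):
    """Distance-to-goal for every cell not yet known to be a wall (BFS).

    Stand-in for the 'dynamic value map': lower distance = higher value.
    """
    rows, cols = size
    dist = {goal: 0}
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in STEPS:
            cell = (r + dr, c + dc)
            in_bounds = 0 <= cell[0] < rows and 0 <= cell[1] < cols
            if in_bounds and cell not in believed_walls and cell not in dist:
                dist[cell] = dist[(r, c)] + 1
                queue.append(cell)
    return dist


def plan_by_active_sensing(true_walls, size, start, goal):
    """Move a simulated gaze from start to goal, sampling one cell per step.

    Returns (gaze_path, n_samples), or (None, n_samples) if the goal is
    found to be unreachable. Unseen cells are assumed open (assumption).
    """
    believed_walls = set()  # partial map built up by visual samples
    gaze, path, n_samples = start, [start], 0
    while gaze != goal:
        dist = value_map(believed_walls, size, goal)  # update value map
        if gaze not in dist:
            return None, n_samples  # no route under current knowledge
        # Follow the value gradient: look at the most promising neighbour.
        neighbours = [(gaze[0] + dr, gaze[1] + dc) for dr, dc in STEPS]
        target = min((n for n in neighbours if n in dist), key=dist.get)
        n_samples += 1  # one visual sample of the true environment
        if target in true_walls:
            believed_walls.add(target)  # map changes, so values change
        else:
            gaze = target
            path.append(target)
    return path, n_samples
```

In this sketch the planner only samples cells adjacent to the current gaze position, so the sample count depends on how many cells lie near the eventual route rather than on the total number of cells in the maze. Calling `plan_by_active_sensing({(0, 1), (1, 1)}, (3, 3), (0, 0), (0, 2))` on a hypothetical 3×3 maze returns a gaze path around the two wall cells together with the number of samples used.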