Bi-Directional Goal-Conditioning on Single Value Function for State Space Search Problems

Published: 03 Nov 2023, Last Modified: 27 Nov 2023, GCRL Workshop
Keywords: Deep Reinforcement Learning, Search Problems, Goal-Conditioning, Search Algorithms
TL;DR: We combine bidirectional RL (from the start and goal states) with multi-task goal-conditioning so that a single policy function solves multiple tasks derived from a state space search problem.
Abstract: State space search problems have a binary (found/not found) reward system. In our work, we assume the ability to sample goal states and use these samples to define a forward task $(\tau^*)$ and a backward task $(\tau^{inv})$, both derived from the original state space search task, to obtain more useful and learnable samples. Analogous to Hindsight Relabelling, we define 'Foresight Relabelling' for reverse trajectories. We also use the agent's ability (from the policy function) to evaluate the reachability of intermediate states and use these states as goals for new sub-tasks. We group these tasks and sample generation strategies and train a single goal-conditioned policy function (DQN) to learn all of them; we call it 'SRE-DQN' (Scrambler-Resolver-Explorer). Finally, we demonstrate the advantages of bi-directional goal-conditioning and knowledge of the goal state by evaluating our framework on classical goal-reaching tasks and comparing it with existing concepts extended to our bi-directional setting.
Submission Number: 2
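
To make the relabelling idea in the abstract concrete, the sketch below applies standard hindsight relabelling (using the final reached state as the substitute goal) to a failed trajectory under the binary found/not-found reward. It is only an illustration under stated assumptions: the `Transition` type, the helper names, and the 1-D chain environment are hypothetical, and treating a backward trajectory (one that starts at the goal state) with the same relabelling rule is an assumption made here, not the paper's definition of 'Foresight Relabelling'.

```python
from typing import Hashable, List, NamedTuple, Sequence


class Transition(NamedTuple):
    """One goal-conditioned transition (hypothetical container)."""
    state: Hashable
    action: int
    next_state: Hashable
    goal: Hashable
    reward: float
    done: bool


def binary_reward(next_state: Hashable, goal: Hashable) -> float:
    """Found/not-found reward: 1 only when the goal is reached."""
    return 1.0 if next_state == goal else 0.0


def rollout(states: Sequence[Hashable], actions: Sequence[int],
            goal: Hashable) -> List[Transition]:
    """Build transitions from a visited-state sequence and its actions."""
    out = []
    for s, a, s_next in zip(states[:-1], actions, states[1:]):
        r = binary_reward(s_next, goal)
        out.append(Transition(s, a, s_next, goal, r, r == 1.0))
    return out


def relabel_with_final_state(traj: List[Transition]) -> List[Transition]:
    """Hindsight relabelling, 'final' strategy: pretend the last state
    actually reached was the goal and recompute the binary reward."""
    if not traj:
        return []
    new_goal = traj[-1].next_state
    relabelled = []
    for t in traj:
        r = binary_reward(t.next_state, new_goal)
        relabelled.append(t._replace(goal=new_goal, reward=r, done=r == 1.0))
    return relabelled


# Toy 1-D chain with start state 0 and goal state 5 (assumed environment).
# Forward task: walk from the start toward the goal, without reaching it.
forward = rollout([0, 1, 2], [+1, +1], goal=5)
# Backward task (assumption): walk from the goal toward the start.
backward = rollout([5, 4, 3], [-1, -1], goal=0)

forward_relabelled = relabel_with_final_state(forward)
backward_relabelled = relabel_with_final_state(backward)
```

Both original trajectories carry zero reward, so on their own they give a DQN nothing to learn from; after relabelling, each ends in a rewarded transition for a goal it really reached. Because the goal is part of each transition, a single goal-conditioned value function can consume samples from both directions.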