Keywords: foundation models, generalization in planning, deep reinforcement learning, pathfinding
TL;DR: We present DeepCubeAF, a method for training a foundation model for heuristic functions that generalize across pathfinding domains.
Abstract: Pathfinding problems arise in fields such as robotics, mathematics, chemistry, and program synthesis, where the objective is to find a sequence of actions that transforms a given start state into a goal state. Recently, deep reinforcement learning (DRL) has emerged as a promising method for automatically training domain-specific heuristic functions to solve these problems in a largely domain-independent fashion. However, these approaches often require retraining for even a slight change in domain, resulting in significant resource and time inefficiencies. While existing approaches use supervised learning to learn generalizable heuristics for unseen domains, they are limited by the need to obtain supervised labels. To address these limitations, we draw inspiration from domain randomization in reinforcement learning and from the DeepCubeA algorithm, and introduce DeepCubeA for foundation models (DeepCubeAF). DeepCubeAF trains a heuristic function across randomly generated domains using reinforcement learning and uses this trained heuristic function with batch weighted A* search to solve problems. Our model consistently shows better generalizability than the existing foundation model on both seen and unseen domains. This work represents a step toward training robust, generalizable models and making them accessible to experts across various fields.
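The abstract does not include implementation details, but the core search procedure it names, weighted A* with a learned heuristic, can be sketched as follows. This is a minimal single-expansion illustration, not the paper's batched GPU variant: the function names, the `weight` parameter default, and the grid example are all assumptions for illustration, and the learned heuristic is stood in for by any callable.

```python
import heapq

def weighted_astar(start, goal, neighbors, heuristic, weight=1.5):
    """Weighted A* search: prioritizes nodes by f(n) = g(n) + weight * h(n).

    `heuristic` stands in for a trained heuristic function (e.g., a neural
    network mapping a state to an estimated cost-to-go); here it is any
    callable from state to float. Returns a start-to-goal path or None.
    """
    # Heap entries: (f-value, path cost g, state, path so far).
    open_heap = [(weight * heuristic(start), 0, start, [start])]
    best_g = {start: 0}
    while open_heap:
        f, g, state, path = heapq.heappop(open_heap)
        if state == goal:
            return path
        if g > best_g.get(state, float("inf")):
            continue  # stale entry; a cheaper path to this state was found
        for nxt, cost in neighbors(state):
            new_g = g + cost
            if new_g < best_g.get(nxt, float("inf")):
                best_g[nxt] = new_g
                heapq.heappush(
                    open_heap,
                    (new_g + weight * heuristic(nxt), new_g, nxt, path + [nxt]),
                )
    return None  # goal unreachable

# Illustrative usage on a 3x3 grid with a Manhattan-distance heuristic.
def grid_neighbors(state):
    x, y = state
    for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nx, ny = x + dx, y + dy
        if 0 <= nx <= 2 and 0 <= ny <= 2:
            yield (nx, ny), 1

path = weighted_astar(
    (0, 0), (2, 2), grid_neighbors,
    heuristic=lambda s: abs(s[0] - 2) + abs(s[1] - 2),
)
```

With `weight > 1` the search is greedier: it typically expands fewer nodes at the cost of bounded suboptimality (the returned path costs at most `weight` times the optimal), which is why weighted variants are common when the heuristic is an expensive learned model.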
Submission Number: 235