Learning Discrete World Models for Classical Planning Problems

Published: 28 Oct 2023, Last Modified: 18 Dec 2023, GenPlan'23
Abstract: For many sequential decision-making domains, planning is necessary to solve problems. However, for domains such as those encountered in robotics, the transition function, also known as the world model, is often unknown and impractical to code by hand. Although planning can be done with a world model trained from observed transitions, such approaches are limited by errors that accumulate when the model is applied across many timesteps and by the inability to re-identify states. Furthermore, even given an accurate world model, domain-independent planning methods may not reliably solve problems, and the domain-specific information required to construct informative heuristics may not be readily available. Methods such as DeepCubeA can learn domain-specific heuristic functions in a largely domain-independent fashion, but they assume a given world model and may also assume that the goal is predetermined. To address these problems, we introduce DeepCubeAI, a domain-independent algorithm that learns a world model that represents states in a discrete latent space, learns a heuristic function that generalizes over start and goal states using this learned model, and combines the learned model and learned heuristic function with search to solve problems. Because the latent space is discrete, we can prevent the accumulation of small errors by rounding, and we can re-identify states by simply comparing two binary vectors. In our experiments on a pixel representation of the Rubik's cube and Sokoban, we find that DeepCubeAI can apply the model for thousands of steps without accumulating any error. Furthermore, DeepCubeAI solves over 99% of test instances in all domains and generalizes across goal states.
Submission Number: 55
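
The abstract's claim that rounding prevents error accumulation and that binary vectors make state re-identification trivial can be illustrated with a minimal sketch. This is not the paper's model: `toy_model_step`, the 4-bit state, the 0.5 rounding threshold, and the constant per-step error of 0.004 are all assumptions chosen only to make the effect visible. The toy "model" flips one bit per action and adds a small systematic error to every coordinate.

```python
from typing import List, Sequence, Set, Tuple

BinaryState = Tuple[int, ...]


def round_latent(z: Sequence[float]) -> BinaryState:
    """Snap each real-valued coordinate to the nearest bit (assumed threshold 0.5)."""
    return tuple(1 if x >= 0.5 else 0 for x in z)


def toy_model_step(z: Sequence[float], action: int, bias: float = 0.004) -> List[float]:
    """Hypothetical stand-in for a learned model: flip bit `action`,
    then add a small systematic prediction error to every coordinate."""
    out = list(z)
    out[action] = 1.0 - out[action]
    return [x + bias for x in out]


def rollout(
    start: BinaryState, actions: Sequence[int], use_rounding: bool
) -> Tuple[List[float], Set[BinaryState]]:
    """Apply the toy model once per action.

    Returns the final latent vector and the set of distinct rounded states
    visited. Because states are tuples of bits, re-identification is plain
    equality, so a set is enough to detect revisits.
    """
    z: List[float] = [float(b) for b in start]
    seen: Set[BinaryState] = {start}
    for a in actions:
        z = toy_model_step(z, a)
        if use_rounding:
            # Rounding after every step discards the small error before it can compound.
            z = [float(b) for b in round_latent(z)]
        seen.add(round_latent(z))
    return z, seen


start: BinaryState = (0, 1, 0, 1)
actions = [0] * 1000   # flip bit 0 a thousand times
true_final = start     # an even number of flips returns to the start state

rounded_final, rounded_seen = rollout(start, actions, use_rounding=True)
drifted_final, _ = rollout(start, actions, use_rounding=False)

print("rounded rollout matches true state:", round_latent(rounded_final) == true_final)
print("unrounded rollout matches true state:", round_latent(drifted_final) == true_final)
print("distinct states seen with rounding:", len(rounded_seen))
```

In this toy setting the per-step error stays below the rounding threshold, so the rounded rollout tracks the true state exactly, while the unrounded rollout lets the untouched coordinates drift until they no longer decode to the correct bits. Whether a learned model's per-step error stays below the threshold in practice is an empirical question that the paper's experiments address.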