Research Area: Data, Science of LMs
Keywords: Planning, Sequential Decision Making, Reasoning
TL;DR: We demonstrate how to train transformers to solve complex planning tasks and how they can be used to find more efficient search methods.
Abstract: While Transformers have enabled tremendous progress in various application settings, such architectures still struggle with solving planning and sequential decision-making tasks. In this work, we demonstrate how to train Transformers to solve complex planning tasks. This is accomplished by first designing a synthetic language that captures the computation performed by the $A^*$ search algorithm when solving a planning task. Then, an encoder-decoder Transformer model is trained to predict this language, resulting in a language model that can correctly solve novel planning tasks by generating $A^*$'s search dynamics. We fine-tune this model to obtain a Searchformer, a Transformer model that optimally solves previously unseen Sokoban puzzles 93.7\% of the time, while using up to 26.8\% fewer search steps than our $A^*$ reference implementation. Searchformer significantly outperforms baselines that predict the optimal plan directly, with a 5-10$\times$ smaller model size and a 10$\times$ smaller training dataset. Lastly, we demonstrate how Searchformer scales to larger and more complex decision-making tasks, with an improved percentage of solved tasks and shortened search dynamics.
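The following is a minimal, hypothetical Python sketch of the idea described in the abstract, assuming a simple grid-maze planning task: run $A^*$, log each node creation and expansion as tokens, and append the optimal plan, yielding a token sequence that an encoder-decoder Transformer could be trained to predict. The token names ("create", "close", "plan") and trace layout are illustrative assumptions, not the paper's exact synthetic language.

```python
# Hypothetical sketch (not the authors' code): serializing A* search dynamics
# on a small grid maze into a flat token sequence.
import heapq

def astar_trace(grid, start, goal):
    """Run A* on a 4-connected grid (0 = free, 1 = wall) and return
    (plan, trace), where trace lists tokens recording the search dynamics."""
    def h(p):  # Manhattan-distance heuristic
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    trace = []
    frontier = [(h(start), 0, start, None)]  # (f, g, node, parent)
    parents, closed = {}, set()
    while frontier:
        f, g, node, parent = heapq.heappop(frontier)
        if node in closed:
            continue
        closed.add(node)
        parents[node] = parent
        # Log the expansion ("close") of a node with its cost-so-far and heuristic.
        trace += ["close", str(node), f"c{g}", f"h{f - g}"]
        if node == goal:
            break
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (node[0] + dx, node[1] + dy)
            if (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])
                    and grid[nxt[0]][nxt[1]] == 0 and nxt not in closed):
                # Log the creation of a frontier node before pushing it.
                trace += ["create", str(nxt), f"c{g + 1}", f"h{h(nxt)}"]
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt, node))

    # Reconstruct the optimal plan and append it after the search-dynamics tokens.
    plan, node = [], goal
    while node is not None:
        plan.append(node)
        node = parents.get(node)
    plan.reverse()
    trace += ["plan"] + [str(p) for p in plan]
    return plan, trace

if __name__ == "__main__":
    maze = [[0, 0, 0], [1, 1, 0], [0, 0, 0]]
    plan, trace = astar_trace(maze, (0, 0), (2, 0))
    print(" ".join(trace))  # token sequence a model could be trained to predict
```

Under this sketch, training on the full trace exposes the model to $A^*$'s intermediate computation rather than only the final plan, which is the distinction the abstract draws between Searchformer and the solution-only baselines.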
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the COLM Code of Ethics on https://colmweb.org/CoE.html
Author Guide: I certify that this submission complies with the submission instructions as described on https://colmweb.org/AuthorGuide.html
Submission Number: 471