Learning to Reason with Transformers via Search Inductive Biases: A Proposal

Published: 13 Dec 2024 · Last Modified: 23 Feb 2025 · LM4Plan · CC0 1.0
Keywords: transformers, search, reasoning, test-time compute
TL;DR: We propose Search Transformers, a novel transformer architecture with adaptive test-time compute that learns to carry out a search process in order to improve its reasoning abilities.
Abstract: Large Language Models have revolutionized the field of AI. Most recently, with the advent of Large Reasoning Models such as OpenAI's o1 (Strawberry), they are becoming increasingly proficient at reasoning tasks, such as math, computer programming, and Sequential Decision Making (e.g., Automated Planning). In this preliminary work, we present an alternative approach to learning to reason: instead of performing reasoning at the LLM level, we propose to do so at the transformer level. To achieve this, we introduce the Search Transformer, a novel neural architecture that enhances the transformer model with a search inductive bias, thus allowing it to perform variable test-time computation. We formulate search operations (e.g., node selection and successor generation) in terms of differentiable, attention-based computations, in order to learn a search process end-to-end using back-propagation. By learning to search, we believe Search Transformers will acquire promising System-2 capabilities, thus surpassing the performance of standard transformers at reasoning-related tasks.
Submission Number: 16
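The abstract describes casting search operations such as node selection and successor generation as differentiable, attention-based computations. The paper itself does not include code; the following is a minimal, hypothetical sketch of what one such "soft search step" could look like in PyTorch. All names (`SoftSearchStep`, `n_successors`, etc.) and design choices (softmax-attention over a frontier of node embeddings, a linear expansion into successors) are illustrative assumptions, not the authors' architecture.

```python
# Hypothetical sketch (not the authors' code): one "soft search step" in which
# node selection and successor generation are expressed as differentiable,
# attention-based operations, so the step can be trained with back-propagation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftSearchStep(nn.Module):
    """Illustrative module: selects a node from a frontier of node embeddings
    via attention, then expands the selected node into successor embeddings."""

    def __init__(self, d_model: int, n_successors: int):
        super().__init__()
        self.query = nn.Parameter(torch.randn(d_model))   # learned selection query
        self.key_proj = nn.Linear(d_model, d_model)        # keys for node scoring
        self.successor_proj = nn.Linear(d_model, n_successors * d_model)
        self.n_successors = n_successors
        self.d_model = d_model

    def forward(self, open_nodes: torch.Tensor) -> torch.Tensor:
        # open_nodes: (num_nodes, d_model) embeddings of frontier ("open list") nodes.
        keys = self.key_proj(open_nodes)                    # (num_nodes, d_model)
        scores = keys @ self.query / self.d_model ** 0.5    # (num_nodes,)
        weights = F.softmax(scores, dim=0)                  # soft "node selection"
        selected = weights @ open_nodes                     # (d_model,) weighted node
        # Successor generation: expand the selected node into n_successors children.
        return self.successor_proj(selected).view(self.n_successors, self.d_model)


if __name__ == "__main__":
    step = SoftSearchStep(d_model=32, n_successors=4)
    frontier = torch.randn(8, 32)      # 8 candidate nodes on the open list
    children = step(frontier)          # (4, 32) differentiable successor embeddings
    children.sum().backward()          # gradients flow through selection and expansion
    print(children.shape)
```

Because the selection is a softmax rather than a hard argmax, the step remains differentiable end-to-end; how the actual Search Transformer chains such steps, scores nodes, or discretizes the search at inference time is not specified in this abstract.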