Keywords: neural compilation, algorithms, large language models, reasoning, planning
TL;DR: In this workshop paper, we study the feasibility of augmenting LLaMA 3 with a library of differentiable programs that provide reasoning ability.
Abstract: Important reasoning tasks such as planning are fundamentally algorithmic: solving them robustly requires inducing the underlying algorithms rather than relying on shortcuts. Large Language Models lack true algorithmic ability, primarily because of limitations in neural-network optimization algorithms, optimization data, and optimization objectives, but also because of the limited expressivity of the transformer architecture. To address this lack of algorithmic ability, we propose augmenting LLMs with an internal reasoning module. This module contains a library of fundamental operations and sophisticated differentiable programs, so that common algorithms do not need to be learned from scratch. To accomplish this, we add memory, registers, basic operations, and adaptive recurrence to a billion-parameter-scale transformer architecture built on LLaMA 3.2. We then define a method for directly compiling algorithms into a differentiable starting library, which the model uses natively and which propagates gradients during optimization. In this workshop paper, we study the feasibility of this augmentation by fine-tuning a small transformer on simple algorithmic tasks with variable computational depth.
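For the register-and-memory augmentation described in the abstract to propagate gradients, register access must itself be differentiable. A minimal sketch of one standard way to achieve this (soft addressing with softmax weights, as in differentiable-computer work; the function names and layout here are illustrative assumptions, not the paper's actual implementation):

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over address logits.
    e = np.exp(x - x.max())
    return e / e.sum()

def soft_read(registers, address_logits):
    # Differentiable read: a convex combination of register values,
    # weighted by a softmax over the address logits.
    w = softmax(address_logits)
    return w @ registers

def soft_write(registers, address_logits, value):
    # Differentiable write: each register is blended toward the new
    # value in proportion to its address weight.
    w = softmax(address_logits)[:, None]
    return (1.0 - w) * registers + w * value

# Toy usage: 4 registers holding 2-dimensional values.
registers = np.zeros((4, 2))
addr = np.array([5.0, 0.0, 0.0, 0.0])   # sharply peaked on register 0
registers = soft_write(registers, addr, np.array([1.0, 2.0]))
out = soft_read(registers, addr)        # close to [1.0, 2.0]
```

Because every operation is a smooth function of the logits and values, gradients can flow through reads and writes back into the network that emits them, which is what allows a compiled library of such operations to be fine-tuned end to end.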
Submission Number: 69