Keywords: LLM Reasoning Distillation, Large Reasoning Model, Reasoning Scaffolding, Semantic Signals
TL;DR: We introduce Reasoning Scaffolding, a new reasoning distillation framework that transfers reasoning patterns—not just text—from large to small language models, resulting in stronger small reasoning models.
Abstract: The prevailing approach to distilling reasoning from Large Language Models (LLMs)—behavioral cloning from textual rationales—is fundamentally limited. It teaches Small Language Models (SLMs) to mimic surface-level patterns rather than the underlying algorithmic structure of thought, resulting in a critical lack of logical robustness. We argue that instead of cloning text, distillation should transfer this algorithmic structure directly. We introduce Reasoning Scaffolding, a framework that reframes reasoning as a structured generation process. Our method first abstracts the teacher's thought process into a sequence of discrete, interpretable semantic signals (e.g., Contrast, Addition) that act as a scaffold. The student model is then trained via a multi-task objective to both (1) predict the next semantic signal, anticipating the reasoning flow, and (2) generate the corresponding step, conditioned on that signal. This multi-task scheme acts as a powerful regularizer, compelling the student to internalize the computational patterns of coherent reasoning. On a suite of challenging reasoning benchmarks, our method significantly outperforms state-of-the-art distillation methods in both accuracy and logical consistency, providing a path towards creating smaller models that are genuine reasoners, not just fluent mimics.
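The multi-task objective described in the abstract can be sketched as a weighted sum of two cross-entropy terms: one for predicting the next semantic signal, one for generating the step's tokens conditioned on that signal. The sketch below is purely illustrative, not the authors' implementation; the function name `scaffolding_loss`, the weighting factor `lam`, and all toy dimensions are assumptions.

```python
import numpy as np

def cross_entropy(logits, target_idx):
    """Softmax cross-entropy for a single categorical prediction."""
    logits = logits - logits.max()  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[target_idx]

def scaffolding_loss(signal_logits, true_signal, token_logits, true_tokens, lam=1.0):
    """Hypothetical combined objective:
    (1) predict the next semantic signal (e.g., Contrast, Addition), plus
    (2) generate the reasoning step's tokens, conditioned on that signal.
    `lam` weights the generation term relative to the signal term."""
    signal_loss = cross_entropy(signal_logits, true_signal)
    gen_loss = np.mean([cross_entropy(l, t) for l, t in zip(token_logits, true_tokens)])
    return signal_loss + lam * gen_loss

# Toy example: 4 candidate semantic signals, a 3-token step over a 10-token vocabulary.
rng = np.random.default_rng(0)
signal_logits = rng.normal(size=4)          # stand-in for the signal-prediction head
token_logits = rng.normal(size=(3, 10))     # stand-in for per-token generation logits
loss = scaffolding_loss(signal_logits, true_signal=1,
                        token_logits=token_logits, true_tokens=[2, 5, 7])
print(loss)
```

Training on both terms jointly is what the abstract frames as a regularizer: the student cannot reduce the total loss by fluent text alone, since it must also anticipate the reasoning flow via the signal head.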
Primary Area: foundation or frontier models, including LLMs
Submission Number: 10987