Keywords: LLM Reasoning Distillation, Large Reasoning Model, Reasoning Scaffolding, Semantic Signals
TL;DR: We introduce Reasoning Scaffolding, a new reasoning distillation framework that transfers reasoning patterns—not just text—from large to small language models, resulting in stronger small reasoning models.
Abstract: The prevailing approach to distilling reasoning from Large Language Models (LLMs)—behavioral cloning from textual rationales—is fundamentally limited. It teaches Small Language Models (SLMs) to mimic surface-level patterns rather than the underlying algorithmic structure of thought, resulting in a critical lack of logical robustness. We argue that instead of cloning text, distillation should transfer this algorithmic structure directly. We introduce Reasoning Scaffolding, a framework that reframes reasoning as a structured generation process. Our method first abstracts the teacher's thought process into a sequence of discrete, interpretable semantic signals (e.g., Contrast, Addition) that act as a scaffold. The student model is then trained via a multi-task objective to both (1) predict the next semantic signal, anticipating the reasoning flow, and (2) generate the corresponding step, conditioned on that signal. This multi-task scheme acts as a powerful regularizer, compelling the student to internalize the computational patterns of coherent reasoning. On a suite of challenging reasoning benchmarks, our method significantly outperforms state-of-the-art distillation methods in both accuracy and logical consistency, providing a path towards creating smaller models that are genuine reasoners, not just fluent mimics.
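The multi-task objective described in the abstract can be sketched as a weighted sum of two cross-entropy terms: one for predicting the next semantic signal, one for generating the step's tokens conditioned on that signal. The sketch below is purely illustrative, not the authors' implementation; the function name `scaffolding_loss`, the weighting factor `lam`, and all toy dimensions are assumptions.

```python
import numpy as np

def cross_entropy(logits, target_idx):
    """Softmax cross-entropy for a single categorical prediction."""
    logits = logits - logits.max()  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[target_idx]

def scaffolding_loss(signal_logits, true_signal, token_logits, true_tokens, lam=1.0):
    """Hypothetical combined objective:
    (1) predict the next semantic signal (e.g., Contrast, Addition), plus
    (2) generate the reasoning step's tokens, conditioned on that signal.
    `lam` weights the generation term relative to the signal term."""
    signal_loss = cross_entropy(signal_logits, true_signal)
    gen_loss = np.mean([cross_entropy(l, t) for l, t in zip(token_logits, true_tokens)])
    return signal_loss + lam * gen_loss

# Toy example: 4 candidate semantic signals, a 3-token step over a 10-token vocabulary.
rng = np.random.default_rng(0)
signal_logits = rng.normal(size=4)          # stand-in for the signal-prediction head
token_logits = rng.normal(size=(3, 10))     # stand-in for per-token generation logits
loss = scaffolding_loss(signal_logits, true_signal=1,
                        token_logits=token_logits, true_tokens=[2, 5, 7])
print(loss)
```

Training on both terms jointly is what the abstract frames as a regularizer: the student cannot reduce the total loss by fluent text alone, since it must also anticipate the reasoning flow via the signal head.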
Primary Area: foundation or frontier models, including LLMs
Submission Number: 10987