Distilling Reasoning into Student LLMs: Local Naturalness for Selecting Teacher Data

ICLR 2026 Conference Submission 21056 Authors

19 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: reasoning distillation, math, teacher, llm
Abstract: Distilling long reasoning traces (10K+ tokens) from stronger teacher models into smaller student LLMs via supervised fine-tuning (SFT) has emerged as a standard paradigm. This approach is both practical and efficient: it leverages the ease of generating abundant reasoning data from stronger models and provides a direct, data-driven way to teach less capable models better reasoning. While previous work has largely focused on prompt selection with responses from a single teacher, the equally important problem of choosing the best response when multiple teacher outputs are available for a single prompt remains underexplored. This challenge becomes especially important in a multi-teacher setting, where different students may benefit from the outputs of different teachers. This paper fills that gap with a systematic study of response selection for reasoning distillation. We first show that the current method, which picks the response to which the student assigns the highest global log-probability (i.e., global "naturalness"), fails when responses come from multiple teachers. In such cases, global naturalness no longer correlates with downstream performance, especially as the reasoning traces from strong teachers grow longer. To overcome this limitation, we introduce Local Naturalness, which scores a response by measuring the student's log-probabilities over short, sequential reasoning steps (e.g., sentences), each conditioned only on a small local window of preceding steps. Local Naturalness enables two novel applications: 1) Teacher Selection: aggregating local scores across prompts reliably identifies the most helpful teacher, whereas global scoring fails completely. 2) Response Selection from a Mixed-Teacher Dataset: when mixing answers from many teachers, Local Naturalness boosts a 32-billion-parameter student's accuracy on math benchmarks by 9.4% over global-naturalness-based selection, also surpassing the performance achieved by training on data from the single best teacher. These results highlight the power of localized data-quality evaluation and data mixing for more effective reasoning distillation.
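The abstract only sketches the scoring rule, so the following is a minimal, hypothetical Python sketch of Local Naturalness, assuming a Hugging Face-style causal student LM, a naive sentence split into steps, a window of the two preceding steps, and mean per-token log-probability as the aggregate; none of these specifics (function names, window size, aggregation) come from the paper.

```python
# Hypothetical sketch of Local Naturalness (window size, step splitting,
# and aggregation are illustrative assumptions, not the paper's exact recipe).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def local_naturalness(student, tokenizer, prompt, response, window=2):
    """Score a teacher response by the student's mean per-token log-prob
    of each reasoning step, conditioned only on the prompt and the
    previous `window` steps (not the full preceding trace)."""
    steps = [s.strip() for s in response.split(". ") if s.strip()]  # crude step split
    step_scores = []
    for i, step in enumerate(steps):
        local_ctx = " ".join([prompt] + steps[max(0, i - window):i])
        ctx_ids = tokenizer(local_ctx, return_tensors="pt").input_ids
        full_ids = tokenizer(local_ctx + " " + step, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = student(full_ids).logits  # [1, T, vocab]
        # Log-prob of each token given everything before it.
        log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
        token_lp = log_probs.gather(-1, full_ids[:, 1:].unsqueeze(-1)).squeeze(-1)
        # Keep only the step's tokens (assumes the context tokenization is a
        # prefix of the full tokenization, which can break at word boundaries).
        n_ctx = ctx_ids.shape[1]
        step_scores.append(token_lp[0, n_ctx - 1:].mean().item())
    return sum(step_scores) / max(len(step_scores), 1)

# Example usage (model names and variables are placeholders):
# student = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-32B")
# tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-32B")
# best = max(teacher_responses,
#            key=lambda r: local_naturalness(student, tokenizer, prompt, r))
```

By contrast, the global-naturalness baseline described in the abstract would condition every token on the entire preceding trace, which is exactly the behavior the paper reports breaking down on long, multi-teacher reasoning traces.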
Primary Area: generative models
Submission Number: 21056