Distill Models by Aptitude: Efficient Reasoning Capability Distillation via Adaptive Data Curation and Overthinking Mitigation

19 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: distillation, data-efficient training
TL;DR: This paper introduces DynaGuide, a novel framework that improves both the efficiency and the performance of distillation.
Abstract: The rapidly increasing computational demands of large language models (LLMs) motivate the distillation of their knowledge and capabilities into smaller models. However, existing attempts to distill LLMs' reasoning capabilities into compact models face critical limitations, including high training and annotation costs, suboptimal data selection, and flawed synthetic data caused by LLMs' general tendency to overthink. This paper introduces DynaGuide, a novel framework that improves both the efficiency and the performance of the distillation process. Our approach integrates (1) Dynamic Data Selection, which adaptively selects high-value training data at a fine granularity throughout training, and (2) Reasoning Pattern Guidance, which mitigates overthinking in synthetic data by incorporating specialized guidance during fine-tuning. Extensive experiments demonstrate that DynaGuide achieves consistent, stable performance improvements across models of different series and parameter scales, with gains surpassing those of baseline methods on knowledge-reasoning question-answering benchmarks. Our systematic ablation studies and analyses further provide valuable insights into distillation and reasoning.
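The abstract does not specify the selection criterion, but as a rough, hypothetical illustration of what adaptive, in-training data selection can look like, the sketch below re-scores a candidate pool with the student's current per-example loss at each epoch and keeps only the hardest fraction for that epoch's updates. The names `student_loss`, `dynamic_select`, and `select_fraction` are illustrative assumptions, not DynaGuide's actual mechanism.

```python
import random

def student_loss(example, epoch):
    """Stand-in for the student model's loss on one example.
    A deterministic random stub here; in practice this would be
    a forward pass through the student being fine-tuned."""
    random.seed(hash((example["id"], epoch)))
    return random.random()

def dynamic_select(pool, epoch, select_fraction=0.3):
    """Re-score the whole pool with the current student and keep the
    highest-loss (presumably most informative) fraction for this epoch."""
    scored = sorted(pool, key=lambda ex: student_loss(ex, epoch), reverse=True)
    return scored[: max(1, int(len(scored) * select_fraction))]

pool = [{"id": i, "question": f"q{i}", "answer": f"a{i}"} for i in range(10)]
for epoch in range(3):
    batch = dynamic_select(pool, epoch)
    # train_step(student, batch)  # the fine-tuning update would go here
    print(f"epoch {epoch}: selected {[ex['id'] for ex in batch]}")
```

Because the scores are recomputed with the evolving student, the selected subset shifts from epoch to epoch, which is the "dynamic" aspect the abstract emphasizes; a static top-k chosen once before training would not adapt this way.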
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 15492