DynaGuide: Efficient Reasoning Capability Distillation via Adaptive Data Curation and Overthinking Mitigation
Abstract: The exponentially increasing computational demands of large language models (LLMs) motivate distilling their capabilities into smaller models. Existing attempts to transfer LLMs' reasoning capabilities to compact models face critical limitations: expensive training or annotation costs, suboptimal data selection, and flawed synthetic data caused by LLMs' general tendency to overthink. This paper introduces DynaGuide, a novel framework that optimizes the distillation process for both efficiency and performance. Our approach integrates (1) Dynamic Data Selection, which adaptively performs fine-grained selection of valuable data during training, and (2) Reasoning Pattern Guidance, which mitigates overthinking in synthetic data by incorporating specialized guidance during fine-tuning. Extensive experiments demonstrate that DynaGuide enables a 7B-parameter model to achieve superior performance on knowledge-reasoning question-answering benchmarks, matching or even exceeding its 32B counterpart. Our systematic ablation studies and analysis further reveal insights into distillation and reasoning.
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: distillation, data-efficient training
Contribution Types: NLP engineering experiment, Approaches low compute settings-efficiency
Languages Studied: English
Submission Number: 4771