Towards Efficient CoT Distillation: Self-Guided Rationale Selector for Better Performance with Fewer Rationales
Abstract: Chain-of-thought (CoT) distillation is critical for enhancing the reasoning ability of small language models (SLMs) by transferring multi-step reasoning capability from larger teacher models. However, existing work underestimates the importance of rationale quality, focusing primarily on data quantity, which may result in transferring noisy or incorrect information to the student model. To address this issue, we propose Model-Oriented Rationale Selection Distillation (MoRSD), which discerns and selects high-quality rationales for distillation. We further propose a Rationale Difficulty (RD) metric to measure the ability of the student model to generate the correct answer under a given rationale. Compared to the baseline, MoRSD achieves an average improvement of 4.6 on seven datasets across three tasks while using fewer rationales, by controlling their accuracy, diversity, and difficulty. Our results reveal that a small portion of high-quality rationales can enhance the reasoning ability of student models more effectively than the entire dataset. Our method offers a promising solution for efficient CoT distillation. Our code will be released to facilitate reproducibility and future research in data efficiency.
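The abstract describes a Rationale Difficulty (RD) metric that scores how readily the student model recovers the correct answer when conditioned on a candidate rationale. As a rough illustration only (the paper's actual formulation and code are not given here), one could implement such a score as the student's average negative log-likelihood of the gold answer given the question and rationale; all names below are hypothetical.

```python
# Hypothetical sketch of an RD-style score: the student's mean per-token
# negative log-likelihood of the gold answer, conditioned on the question
# and a candidate rationale. Lower values would indicate an "easier" rationale.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

def rationale_difficulty(student_name: str, question: str,
                         rationale: str, answer: str) -> float:
    tok = AutoTokenizer.from_pretrained(student_name)
    model = AutoModelForCausalLM.from_pretrained(student_name)
    model.eval()

    prompt = f"{question}\nRationale: {rationale}\nAnswer:"
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    answer_ids = tok(" " + answer, return_tensors="pt",
                     add_special_tokens=False).input_ids

    input_ids = torch.cat([prompt_ids, answer_ids], dim=1)
    # Mask the prompt so the loss is computed over answer tokens only.
    labels = input_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100

    with torch.no_grad():
        loss = model(input_ids, labels=labels).loss  # mean NLL over answer tokens
    return loss.item()
```

Under this reading, rationales could be ranked by such a score (together with accuracy and diversity criteria) to keep only a small, informative subset for distillation.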
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: distillation, data-efficient training, LLM Efficiency
Contribution Types: Reproduction study, Approaches to low-resource settings, Data analysis
Languages Studied: English
Submission Number: 4846