Towards Efficient CoT Distillation: Self-Guided Rationale Selector for Better Performance with Fewer Rationales
Abstract: Chain-of-thought (CoT) distillation is critical for enhancing the reasoning ability of small language models (SLMs) by transferring multi-step reasoning capability from larger teacher models. However, existing work underestimates the importance of rationale quality, focusing primarily on data quantity, which may result in transferring noisy or incorrect information to the student model. To address this issue, we propose Model-Oriented Rationale Selection Distillation (MoRSD), which discerns and selects high-quality rationales for distillation. We further propose a Rationale Difficulty (RD) metric to measure the ability of the student model to generate the correct answer under a given rationale. Compared to the baseline, MoRSD achieves an average improvement of 4.6 on seven datasets across three tasks while using fewer rationales, by controlling their accuracy, diversity, and difficulty. Our results reveal that a small portion of high-quality rationales can enhance the reasoning ability of student models more effectively than the entire dataset. Our method offers a promising solution for efficient CoT distillation. Our code will be released to facilitate reproducibility and future research in data efficiency.
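The abstract describes a Rationale Difficulty (RD) metric that scores how readily the student model recovers the correct answer when conditioned on a candidate rationale. As a rough illustration only (the paper's actual formulation and code are not given here), one could implement such a score as the student's average negative log-likelihood of the gold answer given the question and rationale; all names below are hypothetical.

```python
# Hypothetical sketch of an RD-style score: the student's mean per-token
# negative log-likelihood of the gold answer, conditioned on the question
# and a candidate rationale. Lower values would indicate an "easier" rationale.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

def rationale_difficulty(student_name: str, question: str,
                         rationale: str, answer: str) -> float:
    tok = AutoTokenizer.from_pretrained(student_name)
    model = AutoModelForCausalLM.from_pretrained(student_name)
    model.eval()

    prompt = f"{question}\nRationale: {rationale}\nAnswer:"
    prompt_ids = tok(prompt, return_tensors="pt").input_ids
    answer_ids = tok(" " + answer, return_tensors="pt",
                     add_special_tokens=False).input_ids

    input_ids = torch.cat([prompt_ids, answer_ids], dim=1)
    # Mask the prompt so the loss is computed over answer tokens only.
    labels = input_ids.clone()
    labels[:, : prompt_ids.shape[1]] = -100

    with torch.no_grad():
        loss = model(input_ids, labels=labels).loss  # mean NLL over answer tokens
    return loss.item()
```

Under this reading, rationales could be ranked by such a score (together with accuracy and diversity criteria) to keep only a small, informative subset for distillation.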
Paper Type: Long
Research Area: Efficient/Low-Resource Methods for NLP
Research Area Keywords: distillation, data-efficient training, LLM Efficiency
Contribution Types: Reproduction study, Approaches to low-resource settings, Data analysis
Languages Studied: English
Submission Number: 4846