Mistake-Assisted Distillation: Enhancing Student's CoT Capabilities by Identifying Key Reasoning Steps
Abstract: With the scaling up of model parameters, powerful reasoning capabilities have emerged in Large Language Models (LLMs). However, resource constraints in practical applications make such models difficult to deploy, which has drawn considerable attention to distilling their reasoning capabilities into smaller, compact language models. Prior distillation work simply fine-tunes student models on Chain-of-Thought (CoT) data generated by teacher LLMs, so the student merely imitates the teacher's reasoning style without capturing the key reasoning steps. In this paper, we propose a novel distillation method called \textbf{Mis}take-\textbf{A}ss\textbf{i}sted \textbf{D}istillation (MisAiD) that helps students identify the key reasoning steps and learn the underlying reasoning process. Specifically, we first retain all CoT data annotated by teacher LLMs, irrespective of correctness. Then, we design specific prompts to rectify the teacher's incorrect CoTs and to inject mistakes into the correct CoTs, forming dual CoT data with similar reasoning steps but divergent conclusions. Finally, we identify the key reasoning steps in the dual CoTs and employ a fine-grained loss function to guide student learning. Extensive experiments and comprehensive analyses demonstrate the effectiveness of MisAiD on both in-domain and out-of-domain benchmark reasoning datasets.
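The dual-CoT construction described above suggests one natural mechanism: since the paired CoTs share most steps but diverge toward different conclusions, the divergent steps can be treated as key reasoning steps and upweighted in the student's token-level fine-tuning loss. The sketch below is a minimal illustration of this idea under our own assumptions; `identify_key_steps`, `fine_grained_loss`, the divergence heuristic, and the weighting scheme are hypothetical and not the paper's actual implementation.

```python
# Minimal sketch (not the authors' implementation): pair a correct CoT with a
# mistaken variant, mark the steps where they diverge as "key reasoning steps",
# and upweight tokens in those steps during student fine-tuning.
import difflib
import torch
import torch.nn.functional as F

def identify_key_steps(correct_steps, mistaken_steps):
    """Return indices of correct-CoT steps with no exact match in the mistaken CoT."""
    matcher = difflib.SequenceMatcher(a=correct_steps, b=mistaken_steps)
    matched = {
        i
        for block in matcher.get_matching_blocks()
        for i in range(block.a, block.a + block.size)
    }
    return [i for i in range(len(correct_steps)) if i not in matched]

def fine_grained_loss(logits, labels, key_token_mask, key_weight=2.0):
    """Token-level cross-entropy where tokens inside key reasoning steps get a larger weight."""
    per_token = F.cross_entropy(
        logits.view(-1, logits.size(-1)), labels.view(-1), reduction="none"
    ).view(labels.shape)
    weights = torch.ones_like(per_token)
    weights[key_token_mask] = key_weight
    return (per_token * weights).mean()

# Toy example: the two CoTs share the first step but diverge afterwards.
correct = ["Compute 3 * 4 = 12.", "Add 5 to get 17.", "So the answer is 17."]
mistaken = ["Compute 3 * 4 = 12.", "Add 5 to get 16.", "So the answer is 16."]
print(identify_key_steps(correct, mistaken))  # -> [1, 2]
```

In this sketch the key-step indices would be mapped to a boolean `key_token_mask` over the tokenized correct CoT before calling `fine_grained_loss`; how the actual method scores and weights the key steps is described in the paper itself.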
Paper Type: long
Research Area: Efficient/Low-Resource Methods for NLP
Contribution Types: NLP engineering experiment, Approaches low compute settings-efficiency
Languages Studied: English