Mistake-assisted Distillation: Enhancing Student’s CoT Capabilities by Identifying Key Reasoning Steps

Anonymous

16 Feb 2024 · ACL ARR 2024 February Blind Submission · Readers: Everyone
Abstract: With the scaling up of model parameters, powerful reasoning capabilities have emerged in Large Language Models (LLMs). However, resource constraints in practical applications pose challenges to deploying such models, which has prompted considerable interest in distilling these capabilities into smaller, more compact language models. Prior distillation works simply fine-tune student models on Chain-of-Thought (CoT) data generated by teacher LLMs, so the student merely imitates the teacher’s reasoning style without capturing the key reasoning steps. In this paper, we propose a novel distillation method called \textbf{Mis}take-\textbf{A}ss\textbf{i}sted \textbf{D}istillation (MisAiD) that helps students identify the key reasoning steps and learn the way of thinking behind the reasoning. Specifically, we first retain all CoT data annotated by teacher LLMs, irrespective of correctness. Then, we design specific prompts to rectify the teacher’s wrong CoTs and to introduce mistakes into the correct CoTs, respectively, forming dual CoT data with similar reasoning steps but divergent conclusions. Finally, we identify the key reasoning steps in the dual CoTs and employ a fine-grained loss function to guide student learning. Extensive experiments and comprehensive analyses demonstrate the effectiveness of MisAiD on both in-domain and out-of-domain benchmark reasoning datasets.
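
The abstract only outlines the pipeline, so the sketch below is a minimal, hedged illustration of one way dual CoTs could be used to locate key reasoning steps and weight them in a fine-grained student loss. The step-diffing heuristic, the `key_weight` parameter, and all function names are assumptions made for illustration, not the authors’ implementation.

```python
# Minimal sketch (assumptions, not the paper's code): steps of the correct CoT
# that do not appear in the mistaken "dual" CoT are treated as key reasoning
# steps, and tokens in those steps are upweighted in a token-level loss.

import difflib
from typing import List

import torch
import torch.nn.functional as F


def key_step_mask(correct_steps: List[str], mistaken_steps: List[str]) -> List[bool]:
    """Mark steps of the correct CoT that diverge from the mistaken CoT."""
    matcher = difflib.SequenceMatcher(a=correct_steps, b=mistaken_steps)
    mask = [True] * len(correct_steps)           # default: step is key
    for block in matcher.get_matching_blocks():
        for i in range(block.a, block.a + block.size):
            mask[i] = False                       # shared step -> not key
    return mask


def fine_grained_loss(logits: torch.Tensor,
                      targets: torch.Tensor,
                      step_ids: torch.Tensor,
                      key_mask: List[bool],
                      key_weight: float = 2.0) -> torch.Tensor:
    """Weighted token-level cross-entropy over the correct CoT.

    logits:   (seq_len, vocab) student logits
    targets:  (seq_len,) gold token ids of the correct CoT
    step_ids: (seq_len,) index of the reasoning step each token belongs to
    """
    token_loss = F.cross_entropy(logits, targets, reduction="none")
    weights = torch.tensor(
        [key_weight if key_mask[s] else 1.0 for s in step_ids.tolist()],
        device=token_loss.device,
    )
    return (weights * token_loss).sum() / weights.sum()
```

In this reading, the dual CoTs serve only to decide which steps matter; the student is still trained on the correct CoT, with divergent (key) steps contributing more to the loss than shared ones.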
Paper Type: long
Research Area: Efficient/Low-Resource Methods for NLP
Contribution Types: NLP engineering experiment, Approaches low compute settings-efficiency
Languages Studied: English