Dialectical Chain Distillation: Transferring Dialectical Reasoning from Teacher–Student Interactions to Small Language Models

Published: 07 Jun 2025, Last Modified: 05 Aug 2025
Venue: Practical-DL 2025
License: CC BY 4.0
Keywords: Model Compression, Knowledge Distillation
TL;DR: A novel knowledge distillation framework that enhances the reasoning capability of LLMs through structured teacher-student interactions.
Abstract: While Large Language Models (LLMs) have become widely used in natural language processing, their deployment remains challenging in resource-constrained environments due to substantial computational requirements. Model compression techniques such as pruning, quantization, and knowledge distillation are commonly employed to reduce this burden. However, these methods often compromise model robustness and multi-step reasoning ability. In this paper, we propose Dialectical Chain Distillation (DCD), a novel knowledge distillation framework that enhances the reasoning capability of LLMs through structured teacher–student interactions. DCD constructs dialectical reasoning chains comprising drafting, deep reasoning, verification, and finalization stages, which provide informative and interpretable supervision for training student models. Experimental results on AIME 24, GSM8K, and GPQA Diamond demonstrate that DCD improves both reasoning accuracy and robustness compared to standard Chain-of-Thought distillation, highlighting its effectiveness in producing more reliable compressed LLMs.
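To make the four-stage chain described above concrete, the sketch below shows one way such dialectical supervision could be assembled from teacher outputs and serialized as a training target for a student model. The prompt wording, the tag-based target format, and the `call_teacher` helper are illustrative assumptions for this sketch, not the paper's released implementation.

```python
# Minimal sketch of assembling a dialectical reasoning chain as distillation
# supervision. Stage prompts, dict keys, tag format, and call_teacher() are
# illustrative assumptions, not the authors' released code.

from typing import Callable, Dict


def build_dialectical_chain(question: str,
                            call_teacher: Callable[[str], str]) -> Dict[str, str]:
    """Run the teacher through draft -> deep reasoning -> verification -> final."""
    draft = call_teacher(
        f"Question: {question}\nGive a brief first-pass draft solution."
    )
    reasoning = call_teacher(
        f"Question: {question}\nDraft: {draft}\n"
        "Expand the draft into detailed step-by-step reasoning."
    )
    verification = call_teacher(
        f"Question: {question}\nReasoning: {reasoning}\n"
        "Critically check each step and point out any errors."
    )
    final = call_teacher(
        f"Question: {question}\nReasoning: {reasoning}\n"
        f"Verification: {verification}\n"
        "Write the corrected final answer."
    )
    return {
        "draft": draft,
        "deep_reasoning": reasoning,
        "verification": verification,
        "final": final,
    }


def to_training_example(question: str, chain: Dict[str, str]) -> Dict[str, str]:
    """Serialize the chain into a single target sequence for student fine-tuning."""
    target = (
        f"<draft>{chain['draft']}</draft>\n"
        f"<reason>{chain['deep_reasoning']}</reason>\n"
        f"<verify>{chain['verification']}</verify>\n"
        f"<final>{chain['final']}</final>"
    )
    return {"input": question, "target": target}
```

Under these assumptions, the student is trained on the full serialized chain rather than only the final answer, which is what distinguishes this style of supervision from standard Chain-of-Thought distillation.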
Submission Number: 7