Keywords: Knowledge Distillation, Reinforcement Fine-tuning
Abstract: Knowledge Distillation (KD) has been a cornerstone technique for accelerating Large Language Model (LLM) development by transferring knowledge from powerful teacher models to lightweight students. However, the efficacy of KD is not always guaranteed. Certain combinations of models and datasets have led to unexpected KD failures, which remain poorly understood. In this paper, we take a first step toward answering the fundamental question underlying these failures: what makes an LLM undistillable? To this end, our first contribution is to identify and formalize the phenomenon we term the “distillation trap”, in which teacher LLMs generate outputs that, despite being linguistically coherent, are nonsensical and misguide students during training. We further provide a theoretical account connecting this trap to the KD dynamics of the Kullback-Leibler (KL) divergence, the loss function central to most distillation protocols. Beyond elucidating the causes of KD failures, our second contribution is a control mechanism for LLMs’ distillability. We propose a novel methodology that uses Reinforcement Fine-tuning (RFT) to optimize a composite reward function. This reward balances the teacher’s task capability against a confusion-based reward, which can be applied positively or negatively to either suppress or enhance the model’s amenability to distillation. By maximizing the confusion reward, we deliberately construct “undistillable teachers”, effectively turning latent distillation traps into protective guards of model intellectual property (IP). Extensive experiments across four model pairs and four datasets demonstrate the approach’s effectiveness: our undistillable teachers retain their original performance while causing a catastrophic performance collapse (over 80% accuracy loss) in students trained with state-of-the-art distillation protocols. Our code can be found in the supplementary material.
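The abstract does not spell out the objectives, so the snippet below is only an illustrative sketch of the two quantities it names: the KL-divergence distillation loss described as central to most KD protocols, and a composite RFT reward that balances a task term against a signed confusion term. The function names, the additive form, and the `lam`/`sign` knobs are assumptions for illustration, not the paper's definitions.

```python
import torch
import torch.nn.functional as F

def distillation_kl_loss(student_logits, teacher_logits, temperature=1.0):
    """Standard KL-based distillation loss: match the student's distribution
    to the teacher's softened distribution (forward KL, batch-mean reduced)."""
    teacher_probs = F.softmax(teacher_logits / temperature, dim=-1)
    student_log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    # F.kl_div expects log-probabilities as input and probabilities as target.
    return F.kl_div(student_log_probs, teacher_probs,
                    reduction="batchmean") * temperature ** 2

def composite_reward(task_reward, confusion_reward, lam=1.0, sign=+1.0):
    """Hypothetical composite RFT reward: task capability plus a signed,
    weighted confusion term. sign=+1 rewards confusing (hard-to-distill)
    outputs, sign=-1 penalizes them; lam trades off the two terms."""
    return task_reward + sign * lam * confusion_reward
```

Under this reading, setting `sign=+1` corresponds to maximizing the confusion reward to build an undistillable teacher, while `sign=-1` would instead encourage distillability; how the paper actually measures the confusion signal is not specified in the abstract.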
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 8795