Keywords: Large Language Model, Knowledge Distillation, Alignment
TL;DR: ARTE: A framework that aligns the teacher model with student preferences to generate tailored training examples for Knowledge Distillation.
Abstract: Enhancing the reasoning abilities of lightweight language models (LMs) for tasks like decision-making often relies on instruction tuning, which trains an LM to mimic a reasoning process using labeled question-rationale pairs. These pairs, collectively known as instruction-tuning datasets, are typically generated by more powerful teacher LMs. However, current methods for generating such datasets focus solely on the quality of the questions and rationales from the teacher model's perspective, neglecting the learning preferences of the student LM. To fill this gap, we propose **ARTE** (**A**ligning Teache**R** with Studen**T** Preferenc**E**s), a novel framework that adapts the teacher LM's outputs to the student's preferences, inspired by "responsive teaching" in pedagogy. Our method involves three key steps: (1) generating draft question-rationale pairs from the teacher model, (2) collecting the student's preferences on these draft pairs via one-shot in-context learning, and (3) aligning the teacher model with Direct Preference Optimization (DPO) and then curating tailored question-rationale pairs from the aligned teacher for student training. Through extensive experiments on academic reasoning benchmarks, we demonstrate that student models fine-tuned on datasets tailored by ARTE improve significantly across various reasoning tasks, outperforming students trained on existing instruction-tuning datasets. Moreover, we thoroughly investigate the generalization of ARTE along two axes: how well the fine-tuned students' reasoning ability generalizes, and how well the aligned teacher generalizes in producing tailored training data across tasks and students.
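To make the three-step pipeline concrete, below is a minimal, self-contained Python sketch of the loop the abstract describes. Every name in it (`DraftPair`, `toy_teacher`, `student_preference`, `dpo_align`) is a hypothetical placeholder standing in for the real teacher/student models and the actual DPO training step; this illustrates the control flow under those assumptions, not the authors' implementation.

```python
# A toy sketch of the ARTE loop: draft, collect student preferences, align.
# All names here are hypothetical stand-ins, not the paper's actual API.
import random
from dataclasses import dataclass

@dataclass
class DraftPair:
    question: str
    rationale: str

def toy_teacher(seed: str, n: int = 4) -> list[DraftPair]:
    # Step 1: the teacher drafts several candidate question-rationale pairs.
    return [DraftPair(f"{seed} (variant {i})", f"rationale {i}") for i in range(n)]

def student_preference(pair: DraftPair) -> float:
    # Step 2: a proxy for the student's preference, e.g. its one-shot
    # in-context accuracy when `pair` is used as the demonstration.
    # Here: a random stand-in score.
    return random.random()

def dpo_align(drafts: list[DraftPair]) -> list[DraftPair]:
    # Step 3: in the real method, (chosen, rejected) tuples train the teacher
    # via Direct Preference Optimization; here we simply keep the draft the
    # student "prefers" most, mimicking the aligned teacher's output shift.
    ranked = sorted(drafts, key=student_preference, reverse=True)
    chosen, rejected = ranked[0], ranked[-1]  # one DPO preference tuple
    return [chosen]

tailored = []
for seed in ["grade-school math word problem", "commonsense QA item"]:
    drafts = toy_teacher(seed)           # step 1: draft pairs
    tailored.extend(dpo_align(drafts))   # steps 2-3: rank by preference, align

print(tailored)  # curated pairs used to instruction-tune the student
```

In the full method, the ranking signal would come from the student model itself and the aligned teacher would be re-sampled to produce the tailored dataset, rather than reusing the drafts as this toy loop does.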
Primary Area: foundation or frontier models, including LLMs
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Reciprocal Reviewing: I understand the reciprocal reviewing requirement as described on https://iclr.cc/Conferences/2025/CallForPapers. If none of the authors are registered as a reviewer, it may result in a desk rejection at the discretion of the program chairs. To request an exception, please complete this form at https://forms.gle/Huojr6VjkFxiQsUp6.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 5443