Transplanting Knowledge: A Study on Layer-Specific Grafting in LLMs

AAAI 2025 Workshop CoLoRAI Submission12 Authors

22 Nov 2024 (modified: 03 Feb 2025), AAAI 2025 Workshop CoLoRAI Submission, CC BY 4.0
Keywords: Model Stitching, Model Merging, Grafting
TL;DR: We develop an approach for transferring specialized capabilities between LLMs by grafting specific layers. We then investigate key factors influencing grafting success.
Abstract: This paper introduces layer-specific grafting, a novel approach for transferring specialized capabilities between Large Language Models (LLMs). We specialize individual layers by fine-tuning them on a task and then graft them into a host model. Using modular addition as a test case, we demonstrate the feasibility of grafting task-specific functionality while evaluating how well the host's general language generation capabilities are preserved. To ensure smooth integration, we employ translator modules that align the residual streams of the two models. Experiments reveal key factors influencing grafting success, including layer selection, dataset alignment, and model size. While the method effectively transfers task adherence, challenges persist in achieving strict format retention and maintaining fluency. These findings establish layer grafting as a promising tool for modular and efficient AI system development.
Submission Number: 12
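
The grafting idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the layers here are plain linear maps over a small vector standing in for the residual stream, the translators are fixed matrices rather than trained modules, and all names (`linear`, `residual`, `graft`, `run`, `to_donor`, `to_host`) are hypothetical. It only shows the wiring: the host's residual stream is translated into the donor's space, passed through the donor block, and translated back before being added to the host stream.

```python
from typing import Callable, List

Vector = List[float]
Layer = Callable[[Vector], Vector]


def linear(weights: List[List[float]]) -> Layer:
    """Return a layer that computes the matrix-vector product weights @ x."""
    def apply(x: Vector) -> Vector:
        return [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return apply


def residual(block: Layer) -> Layer:
    """Wrap a block so that its output is added to the residual stream."""
    def apply(x: Vector) -> Vector:
        return [xi + bi for xi, bi in zip(x, block(x))]
    return apply


def graft(host_layers: List[Layer], index: int, donor_block: Layer,
          to_donor: Layer, to_host: Layer) -> List[Layer]:
    """Return a copy of host_layers with layer `index` replaced by the
    donor block, wrapped in translators (assumed here to be given)."""
    def grafted_block(x: Vector) -> Vector:
        return to_host(donor_block(to_donor(x)))
    new_layers = list(host_layers)
    new_layers[index] = residual(grafted_block)
    return new_layers


def run(layers: List[Layer], x: Vector) -> Vector:
    """Pass x through the layers in order."""
    for layer in layers:
        x = layer(x)
    return x


# Toy host with residual width 2: layer 0 doubles x, layer 1 is a no-op.
host = [residual(linear([[1, 0], [0, 1]])),
        residual(linear([[0, 0], [0, 0]]))]

# Toy donor block with residual width 3: swaps the first two coordinates.
donor_block = linear([[0, 1, 0], [1, 0, 0], [0, 0, 1]])

# Fixed translators (an assumption; trained modules would go here).
to_donor = linear([[1, 0], [0, 1], [0, 0]])   # width 2 -> width 3
to_host = linear([[1, 0, 0], [0, 1, 0]])      # width 3 -> width 2

grafted = graft(host, 1, donor_block, to_donor, to_host)
host_out = run(host, [1.0, 2.0])
grafted_out = run(grafted, [1.0, 2.0])
```

Keeping the translators as separate callables means the donor block itself is never modified, and `graft` returns a new list, so the original host stays available for comparing outputs before and after grafting.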