Fit-LoRA: Fit Your LoRAs to Pruned LLMs Without Additional Training or Data

Published: 20 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · License: CC BY 4.0
Keywords: Parameter Efficient Fine-Tuning, Pruning, Sparsity, Large Language Models, Portability, Low-Rank Adaptation
Abstract: Personalizing LLMs via fine-tuning has become a popular way to enhance performance on downstream tasks. However, the adaptation obtained from fine-tuning is specific to the base model: any modification to the base model's structure requires users to fine-tune on the downstream task again. During deployment, a base model may be pruned to obtain several LLM scales tailored to specific compute budgets. In this scenario, keeping up with personalization becomes challenging, since each derived model must be fine-tuned individually. To address this challenge, we explore leveraging the base model's fine-tuned knowledge to personalize any derived model. In this paper, we present Fit-LoRA, a framework that transfers fine-tuning knowledge from a base LLM to derived LLMs of smaller scales without any training or access to the original fine-tuning data. We validate our approach with extensive experiments on representative datasets such as BoolQ, SST-2, MRPC, RTE, and WinoGrande, across model architectures including Llama-2, Llama-3.1, Mistral, and Gemma-2. Furthermore, we demonstrate the effectiveness of our approach through its compatibility with multiple state-of-the-art LLM pruning methods, including depth pruning, structured pruning, and sparsification.
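To make the setting concrete, here is a minimal sketch of one way a LoRA update could be fitted to a structurally pruned layer without training or data. This is a hypothetical illustration, not the paper's actual algorithm: it assumes the pruning method exposes the kept input/output channel indices, and simply slices the low-rank factors to match.

```python
import numpy as np

# Hypothetical illustration (not Fit-LoRA's exact method): adapting a LoRA
# update W + B @ A to a structurally pruned base layer by slicing the
# low-rank factors with the pruning's kept-index sets.
d_in, d_out, r = 8, 6, 2
rng = np.random.default_rng(0)
A = rng.standard_normal((r, d_in))    # LoRA down-projection factor
B = rng.standard_normal((d_out, r))   # LoRA up-projection factor

keep_in = np.array([0, 2, 3, 5, 7])   # input channels kept by the pruner
keep_out = np.array([1, 2, 4])        # output channels kept by the pruner

# Fit the LoRA factors to the pruned layer: no gradient step, no data.
A_pruned = A[:, keep_in]              # shape (r, len(keep_in))
B_pruned = B[keep_out, :]             # shape (len(keep_out), r)

# The fitted update equals the dense update restricted to kept channels.
delta = B @ A
assert np.allclose(B_pruned @ A_pruned, delta[np.ix_(keep_out, keep_in)])
```

Under this assumption the transfer is exact for the surviving channels; handling depth pruning or unstructured sparsification would require additional mapping rules.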
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 24065