DPLoRA: A Dual-Pruning Framework Based on ILP Optimization and Progressive Pruning for Parameter-Efficient LoRA Fine-Tuning
Keywords: Parameter-Efficient Fine-Tuning, Integer Linear Programming, Dual Pruning Framework, Low-Rank Adaptation, Large Language Models
TL;DR: A dual-pruning framework for parameter-efficient LoRA fine-tuning based on ILP optimization and progressive pruning.
Abstract: We propose DPLoRA (Dual-Pruning Low-Rank Adaptation), an optimized Low-Rank Adaptation (LoRA) method for parameter-efficient fine-tuning of large language models. Our approach introduces a two-stage compression framework: (1) an initial pruning stage, OPLoRA, which formulates a first Integer Linear Programming (ILP) problem to automatically discover the optimal layer-wise LoRA rank ($r$) configuration before training; and (2) a progressive pruning stage, which formulates a second ILP problem during training and incorporates an Exponential Moving Average (EMA) of layer-wise importance scores to further reduce the ranks adaptively. On the GLUE benchmark, our first stage, OPLoRA, achieves new state-of-the-art (SOTA) performance, surpassing all baselines. The full DPLoRA framework also outperforms strong parameter-efficient fine-tuning (PEFT) methods such as AdaLoRA and SoRA while reducing trainable parameters by up to 80\% and training time by 50\%. This study offers a new direction for efficiently deploying large-scale language models in resource-constrained environments.
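To make the two-stage idea concrete, below is a minimal sketch of how a layer-wise rank-allocation ILP and an EMA-smoothed importance update could be set up. The abstract does not specify DPLoRA's actual objective, constraints, importance metric, or hyperparameters, so the candidate ranks, importance scores, parameter budget, model width, and EMA decay used here are placeholder assumptions, and the solver (PuLP with CBC) is simply one readily available ILP backend, not necessarily the one used in the paper.

```python
# Illustrative sketch only: all numeric values and the objective/constraint
# choices below are placeholder assumptions, not DPLoRA's actual formulation.
import pulp

# --- Stage 1 (OPLoRA-style): ILP over per-layer LoRA rank choices ----------
num_layers = 12
candidate_ranks = [2, 4, 8, 16]                            # assumed rank grid
importance = [1.0 + 0.1 * l for l in range(num_layers)]    # placeholder scores
hidden_dim = 768                                           # assumed model width
budget = 0.5 * num_layers * 2 * hidden_dim * 8             # assumed parameter cap

prob = pulp.LpProblem("lora_rank_allocation", pulp.LpMaximize)

# Binary variable x[l, r] = 1 if layer l is assigned rank r.
x = {
    (l, r): pulp.LpVariable(f"x_{l}_{r}", cat="Binary")
    for l in range(num_layers)
    for r in candidate_ranks
}

# Objective: reward giving larger ranks to more important layers
# (a stand-in for whatever utility the paper's first ILP optimizes).
prob += pulp.lpSum(
    importance[l] * r * x[l, r]
    for l in range(num_layers)
    for r in candidate_ranks
)

# Exactly one rank per layer.
for l in range(num_layers):
    prob += pulp.lpSum(x[l, r] for r in candidate_ranks) == 1

# Total LoRA parameter count (2 * d * r per adapted matrix) stays within budget.
prob += pulp.lpSum(
    2 * hidden_dim * r * x[l, r]
    for l in range(num_layers)
    for r in candidate_ranks
) <= budget

prob.solve(pulp.PULP_CBC_CMD(msg=False))
ranks = {l: r for (l, r), var in x.items() if var.varValue > 0.5}
print("per-layer ranks:", ranks)


# --- Stage 2 (progressive pruning): EMA-smoothed layer importance ----------
def ema_update(ema_scores, new_scores, beta=0.9):
    """Exponential moving average of layer-wise importance between ILP re-solves."""
    return [beta * e + (1 - beta) * s for e, s in zip(ema_scores, new_scores)]
```

In this sketch, the second-stage ILP would be re-solved periodically during training with the EMA-smoothed scores in place of the static importance vector, progressively shrinking the per-layer ranks; how often it is re-solved and how ranks are actually pruned are details not given in the abstract.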
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 18653