Keywords: peft, llm, fine-tuning, quantum, compound, adapters, lora, parameter efficient fine tuning, large language models, low rank adaptation, llama
TL;DR: The paper introduces QuIC Adapters, a parameter-efficient fine-tuning method using compound operations inspired by the quantum machine learning literature that offers a promising approach to efficient model adaptation.
Abstract: Scaling full fine-tuning of large foundation models strains GPU memory and training time. Parameter-Efficient Fine-Tuning (PEFT) methods address this issue via adapter modules that update only a small subset of model parameters. In this work, we introduce Quantum-Inspired Compound Adapters (QuIC Adapters), a PEFT approach inspired by Hamming-weight-preserving quantum circuits that can effectively fine-tune a model using less than $0.02\%$ of the base model's memory footprint. QuIC adapters preserve pretrained representations by enforcing orthogonality in weight parameters, and have native deployment mechanisms on quantum computers. We test QuIC adapters by fine-tuning large language models such as LLaMA and vision transformers on language, math, reasoning and vision benchmarks. In its first-order configuration, QuIC recovers the performance of existing orthogonal methods, while higher-order configurations enable substantial parameter compression (over $40\times$ smaller than LoRA) at a modest performance trade-off, unlocking applications in highly resource-constrained environments. Through ablation studies, we determine that combining multiple Hamming-weight orders with orthogonality and matrix compounding is essential for performant fine-tuning. Our findings suggest that QuIC adapters offer a promising direction for efficient fine-tuning of foundation models in resource-constrained environments.
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 16351