Supra-Tuning: Combining Outlier and Low-Rank Adaptation for Sparse and Efficient LLM Fine-Tuning

ICLR 2026 Conference Submission 25479 Authors

20 Sept 2025 (modified: 08 Oct 2025) · ICLR 2026 Conference Submission · CC BY 4.0
Keywords: PEFT, Fine Tuning, LLM, Training, Deep Learning, AI, Language Models, Llama, Wanda, Outliers
TL;DR: We propose Super, a sparse fine-tuning method that updates only key outlier weights, and Supra, a hybrid that combines Super with LoRA.
Abstract: Large language models (LLMs) have demonstrated remarkable capabilities but remain expensive to fine-tune due to their size. Recent parameter-efficient tuning methods, such as Low-Rank Adaptation (LoRA), reduce the number of trainable parameters while maintaining performance. In this work, we introduce Super, a novel sparse adaptation technique that selects and trains only a small set of influential weights—so-called super weights—identified via outlier metrics such as WANDA. We show that fine-tuning these outlier weights yields strong performance with minimal parameter updates. Building on this idea, we propose Supra, a hybrid method that combines Super with LoRA, merging sparse and low-rank adaptations into a unified tuning strategy. Our experiments on several LLMs and downstream tasks demonstrate that both Super and Supra outperform existing sparse or low-rank methods alone in perplexity and task performance, while reducing computational and memory overhead. Supra-Tuning offers a simple yet powerful framework for efficient and scalable adaptation of LLMs.
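The abstract describes selecting "super weights" via an outlier metric such as WANDA, which scores each weight by its magnitude times the norm of the corresponding input activations. Below is a minimal sketch of that selection step, assuming a NumPy setting; the function names, the `keep_frac` parameter, and the global top-k selection rule are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def wanda_scores(W, X):
    """WANDA-style importance score: |W_ij| * ||X_j||_2.

    W: (out_features, in_features) weight matrix.
    X: (n_samples, in_features) calibration activations.
    """
    return np.abs(W) * np.linalg.norm(X, axis=0)

def super_mask(W, X, keep_frac=0.01):
    """Boolean mask over W marking the top-scoring outlier weights.

    Only these masked weights would be made trainable in a Super-style
    sparse update (hypothetical selection rule for illustration).
    """
    scores = wanda_scores(W, X)
    k = max(1, int(keep_frac * scores.size))
    threshold = np.partition(scores.ravel(), -k)[-k]
    return scores >= threshold
```

During fine-tuning, gradients would then be applied only where the mask is true (e.g. `W_grad *= mask`), and a Supra-style hybrid would add a LoRA update `A @ B` on top of this sparse delta.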
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 25479