Does Low Rank Adaptation Lead to Lower Robustness against Training-Time Attacks?

Published: 01 May 2025 · Last Modified: 18 Jun 2025 · ICML 2025 Poster · CC BY-NC-ND 4.0
TL;DR: This study examines the security risks of LoRA during fine-tuning, demonstrating that its low-rank structure enhances robustness to backdoor attacks but increases susceptibility to untargeted data poisoning.
Abstract: Low-rank adaptation (LoRA) has emerged as a prominent technique for fine-tuning large language models (LLMs) thanks to its substantial efficiency gains over previous methods. While extensive studies have examined the performance and structural properties of LoRA, its behavior under training-time attacks remains underexplored, posing significant security risks. In this paper, we theoretically investigate the security implications of LoRA's low-rank structure during fine-tuning, in the context of its robustness against data poisoning and backdoor attacks. We propose an analytical framework that models LoRA's training dynamics, employs the neural tangent kernel to simplify the analysis of the training process, and applies information theory to establish connections between LoRA's low-rank structure and its vulnerability to training-time attacks. Our analysis indicates that LoRA exhibits better robustness to backdoor attacks than full fine-tuning, while becoming more vulnerable to untargeted data poisoning due to its over-simplified information geometry. Extensive experimental evaluations corroborate our theoretical findings.
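For reference, a minimal sketch of the setup the abstract describes, using standard LoRA and NTK notation (the notation here is ours and not necessarily the paper's): LoRA constrains the fine-tuning update to a low-rank product, and the NTK view linearizes training around the pretrained initialization.

```latex
% Standard LoRA parameterization (rank r << min(d, k)); B is zero-initialized,
% A is Gaussian -- its variance and the rank r are the two factors studied.
\[
  W = W_0 + \tfrac{\alpha}{r} BA, \qquad
  B \in \mathbb{R}^{d \times r},\; B_{\mathrm{init}} = 0, \qquad
  A \in \mathbb{R}^{r \times k},\; A_{ij} \sim \mathcal{N}(0, \sigma_A^2).
\]
% NTK view: linearize the fine-tuned model around its initialization \theta_0,
% so training dynamics are governed by the kernel \Theta.
\[
  f(x;\theta) \approx f(x;\theta_0) + \nabla_\theta f(x;\theta_0)^\top (\theta - \theta_0),
  \qquad
  \Theta(x, x') = \nabla_\theta f(x;\theta_0)^\top \nabla_\theta f(x';\theta_0).
\]
```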
Lay Summary: As a widely used parameter-efficient fine-tuning (PEFT) method and a near-standard in large model training, LoRA (Low-Rank Adaptation) has been extensively studied in terms of its properties and variants, and has been adapted for use in nearly every area of LLM training. However, one question remains unexplored: Is LoRA inherently less secure than full fine-tuning when it comes to training-time attacks? In this work, we formally model two representative attacks (backdoor attacks and untargeted poisoning attacks) and use the NTK (Neural Tangent Kernel) and information geometry theory to quantitatively analyze the security of LoRA versus full fine-tuning. Our findings reveal that LoRA exhibits better robustness to backdoor attacks than full fine-tuning, while becoming more vulnerable to untargeted data poisoning. We also demonstrate how two key factors in LoRA, the initialization variance of matrix A and the rank selection, affect its robustness; a minimal illustration of these two factors follows below. Beyond security, our analytical framework provides a more intuitive and concise explanation for some of LoRA's intriguing properties: Why are the two sub-matrices in LoRA asymmetric? Why does LoRA require higher learning rates? Why does LoRA's initialization strategy matter? It also accounts for other phenomena observed in prior research. If you are interested in LoRA's security analysis, theoretical foundations, or model architecture theory, please read this paper.
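As a concrete illustration of the two factors mentioned above, here is a minimal LoRA linear layer in PyTorch. The parameter names (r, alpha, a_init_std) are our own, and this sketch is not the authors' released implementation; see the repository linked below for that.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA wrapper: y = base(x) + (alpha / r) * x A^T B^T."""

    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0,
                 a_init_std: float = 0.02):
        super().__init__()
        self.base = base                      # frozen pretrained layer (W_0)
        for p in self.base.parameters():
            p.requires_grad_(False)
        self.scaling = alpha / r
        # A: Gaussian init (its variance is one of the two factors studied);
        # B: zero init, so the adapter starts as a no-op.
        self.A = nn.Parameter(torch.randn(r, base.in_features) * a_init_std)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scaling * (x @ self.A.T @ self.B.T)

# Usage: wrap a pretrained projection and fine-tune only A and B.
layer = LoRALinear(nn.Linear(768, 768), r=8, a_init_std=0.02)
```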
Link To Code: https://github.com/liangzid/LoRA-sSecurity
Primary Area: Social Aspects->Security
Keywords: Low Rank Adaptation, Large Language Models, Backdoor Attacks, Poisoning Attacks
Submission Number: 9801