Keywords: Adversarial robustness, parameter-efficient fine-tuning
Abstract: Pre-trained foundation models (PTMs) that undergo standard pre-training can be efficiently fine-tuned for downstream tasks using parameter-efficient fine-tuning (PEFT) methods. However, these models remain highly vulnerable to adversarial perturbations. Existing studies often distribute PEFT parameters uniformly across layers, overlooking the varying importance of individual layers. In this work, we systematically analyze the adversarial robustness of PEFT strategies and introduce a novel vulnerability score, a computationally efficient gradient-based measure that identifies which layers and components are most susceptible to adversarial attacks. Guided by this score, we design robustness-aware PEFT methods: LoRA High, which concentrates parameters in the most vulnerable layers, and LoRA+Adapter, which assigns LoRA to the attention component and adapters to the feed-forward component. Extensive adversarial-training experiments on four real-world image classification datasets show that these targeted PEFT designs consistently outperform vanilla PEFT methods. Post-adversarial-fine-tuning analysis with a pruning-style attribution score confirms that strategically protecting the most vulnerable parts of the backbone is key to robustness in PEFT.
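The abstract describes the vulnerability score only as a "computationally efficient gradient-based measure" without giving its formula. As a minimal illustrative sketch (not the paper's actual method), one plausible per-layer gradient-based proxy is the RMS gradient magnitude of the task loss over each top-level block's parameters; the function name, aggregation choice, and "LoRA High" usage below are assumptions for illustration only.

```python
import torch
import torch.nn.functional as F

def layer_vulnerability_scores(model, inputs, labels):
    """Illustrative proxy score, assuming the backbone exposes its blocks as
    named children (e.g. ViT transformer blocks) and returns class logits.
    NOT the paper's exact score: it ranks layers by RMS gradient magnitude
    of the clean cross-entropy loss, as one simple gradient-based measure."""
    model.zero_grad()
    loss = F.cross_entropy(model(inputs), labels)
    loss.backward()

    scores = {}
    for name, module in model.named_children():
        grad_sq, n_params = 0.0, 0
        for p in module.parameters():
            if p.grad is not None:
                grad_sq += p.grad.detach().pow(2).sum().item()
                n_params += p.numel()
        if n_params > 0:
            scores[name] = (grad_sq / n_params) ** 0.5  # RMS gradient per block
    return scores  # larger value ~ more "vulnerable" block

# Hypothetical usage in the spirit of "LoRA High": attach LoRA modules only
# to the top-k highest-scoring blocks and keep the remaining blocks frozen.
# scores = layer_vulnerability_scores(backbone, x_batch, y_batch)
# top_k_blocks = sorted(scores, key=scores.get, reverse=True)[:4]
```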
Primary Area: alignment, fairness, safety, privacy, and societal considerations
Submission Number: 20222