Unveiling the Intertwined Relationship Between Essential Sparsity and Robustness in Large Pre-trained Models

ICLR 2024 Workshop DMLR Submission 50 Authors

Published: 04 Mar 2024, Last Modified: 02 May 2024
Venue: DMLR @ ICLR 2024
License: CC BY 4.0
Keywords: Pruning, Adversarial Robustness, Distribution Shifts, Pre-trained Models
Abstract: In the era of pre-trained LLMs, understanding their intrinsic sparse patterns becomes paramount, especially in the context of their scalability and efficiency. Recently, Jaiswal et al. (2023a) coined the concept of "essential sparsity" ($\mathtt{ES}$), which refers to the existence of a sharp turning point in the sparsity-performance curve when large pre-trained models are pruned using simple magnitude-based criteria. Despite significant attention to how pruning impacts the performance of pre-trained models, its impact on adversarial robustness and distribution shifts has been overlooked. In this work, we extend the concept of $\mathtt{ES}$ to robustness, $\mathtt{ES_{robust}}$, which captures the analogous sharp turning point in robust performance. In comparison with clean performance, we find that sparsity tends to benefit robust performance, and $\mathtt{ES_{robust}}$ is observed at slightly higher sparsity than $\mathtt{ES}$. Our study presents a simple yet intriguing message: one-shot low-magnitude pruning is a powerful tool for identifying subnetworks that retain not only clean performance but also robust performance on adversarial benchmarks. In addition, we find that carefully designed weight-importance criteria can push $\mathtt{ES_{robust}}$ to non-trivial sparsity ratios (e.g., 50-55\%). Moreover, we extend our experiments to popular textual attacks (e.g., deletion, character swap) that induce distribution shifts, and find that our observations regarding $\mathtt{ES_{robust}}$ still hold. All related code will be open-sourced.
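The pruning criterion discussed in the abstract, one-shot global magnitude pruning, can be sketched as follows. This is a minimal PyTorch illustration of the general technique, not the authors' released implementation; the function name, the choice of pruning every weight tensor with two or more dimensions, and the thresholding details are assumptions made for clarity.

```python
import torch

def one_shot_magnitude_prune(model: torch.nn.Module, sparsity: float) -> torch.nn.Module:
    """Zero out the globally lowest-magnitude weights in one shot.

    `sparsity` is the fraction of weights to remove (e.g., 0.5 for the
    50-55% regime mentioned in the abstract). Hypothetical sketch: the
    set of prunable tensors and the global threshold are assumptions.
    """
    # Collect prunable tensors (here: every weight with >= 2 dimensions).
    weights = [p for p in model.parameters() if p.dim() >= 2]
    scores = torch.cat([w.detach().abs().flatten() for w in weights])

    # Find the global magnitude threshold below which weights are removed.
    k = int(sparsity * scores.numel())
    if k == 0:
        return model
    threshold = torch.kthvalue(scores, k).values

    # Apply the mask in place: keep only weights above the threshold.
    with torch.no_grad():
        for w in weights:
            w.mul_((w.abs() > threshold).to(w.dtype))
    return model
```

Sweeping `sparsity` over a grid and measuring clean and adversarial accuracy at each point is how one would empirically locate the turning points $\mathtt{ES}$ and $\mathtt{ES_{robust}}$ described above.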
Primary Subject Area: Role of data in foundation models: pre-training, prompting, fine-tuning
Paper Type: Research paper: up to 8 pages
Participation Mode: Virtual
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Submission Number: 50