Robust Fine-Tuning from Non-Robust Pretrained Models: Mitigating Suboptimal Transfer With Epsilon-Scheduling
Keywords: robust fine-tuning, robust generalization, pretraining, adversarial robustness, fine-tuning, transfer learning
TL;DR: Fine-tuning non-robust pretrained models with robust objectives suffers from suboptimal transfer, which we mitigate using Epsilon-Scheduling, leading to consistent gains in expected robustness.
Abstract: Fine-tuning pretrained models is the standard approach in current machine learning practice, but simultaneously achieving robustness to adversarial examples remains a challenge. Despite the abundance of non-robust pretrained models in open-source repositories, their use for Robust Fine-Tuning (RFT) remains understudied. This work aims to bridge this gap by systematically examining RFT from such models. Our experiments reveal that fine-tuning non-robust models with a robust objective, even under small perturbations, can lead to poor performance, a phenomenon we dub \emph{suboptimal transfer}. We find that the robust objective impedes task alignment at the beginning of training and ultimately prevents optimal transfer. To promote optimal transfer, we propose \emph{Epsilon-Scheduling}, a simple heuristic that schedules the perturbation strength over the course of fine-tuning. Additionally, we introduce \emph{expected robustness}, a metric that measures performance across a range of perturbation strengths. Experiments on six pretrained models and five datasets show that \emph{Epsilon-Scheduling} prevents \emph{suboptimal transfer} and consistently improves expected robustness.
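To make the two ingredients named in the abstract concrete, the sketch below shows one plausible reading of them in PyTorch: a linear warm-up schedule on the L-infinity perturbation budget used by PGD during robust fine-tuning, and an expected-robustness metric computed as the mean robust accuracy over a grid of perturbation strengths. The function names (`epsilon_schedule`, `pgd_attack`, `expected_robustness`), the linear warm-up shape, and the evaluation grid are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F


def epsilon_schedule(step, total_steps, eps_max, warmup_frac=0.5):
    """Hypothetical schedule: linearly ramp the perturbation budget from 0 to
    eps_max over the first warmup_frac of fine-tuning, then hold it constant."""
    warmup_steps = max(1, int(warmup_frac * total_steps))
    return eps_max * min(step / warmup_steps, 1.0)


def pgd_attack(model, x, y, eps, alpha, n_iter=10):
    """Standard L-infinity PGD at strength eps (inputs assumed to lie in [0, 1])."""
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(n_iter):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = (x + delta).clamp(0, 1) - x  # keep adversarial inputs in the valid range
        delta = delta.detach().requires_grad_(True)
    return (x + delta).detach()


def robust_finetune(model, loader, optimizer, total_steps, eps_max=8 / 255):
    """Robust fine-tuning loop where the attack strength follows the schedule."""
    step = 0
    for x, y in loader:
        eps = epsilon_schedule(step, total_steps, eps_max)
        x_adv = pgd_attack(model, x, y, eps, alpha=eps / 4) if eps > 0 else x
        loss = F.cross_entropy(model(x_adv), y)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        step += 1
        if step >= total_steps:
            break


def expected_robustness(model, loader, eps_grid=(0.0, 2 / 255, 4 / 255, 8 / 255)):
    """One possible reading of 'expected robustness': mean robust accuracy over a
    grid of perturbation strengths (i.e. a uniform distribution over eps_grid)."""
    accs = []
    for eps in eps_grid:
        correct, total = 0, 0
        for x, y in loader:
            x_eval = pgd_attack(model, x, y, eps, alpha=eps / 4) if eps > 0 else x
            with torch.no_grad():
                correct += (model(x_eval).argmax(dim=1) == y).sum().item()
            total += y.numel()
        accs.append(correct / total)
    return sum(accs) / len(accs)
```

Under this reading, the schedule starts the attack at strength zero so that early fine-tuning steps behave like standard (non-robust) fine-tuning and task alignment is not impeded, then gradually raises the budget toward the target value.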
Submission Number: 147