A Closer Look at Model Adaptation using Feature Distortion and Simplicity Bias

Published: 01 Feb 2023, 19:23
Last Modified: 01 Feb 2023, 19:23
ICLR 2023 notable top 25%
Readers: Everyone
Keywords: Transfer Learning, Robustness, Adaptation, Data Augmentation
TL;DR: Mitigating feature distortion is not enough to ensure that transfer learning from large-scale, pretrained models leads to better safety and generalization on downstream tasks.
Abstract: Advances in the expressivity of large-scale pretrained models have increased interest in the design of adaptation protocols that enable safe and effective transfer learning. Going beyond conventional linear probing (LP) and fine-tuning (FT) strategies, protocols that can effectively control feature distortion, i.e., the failure to update features orthogonal to the in-distribution, during FT have been found to achieve improved out-of-distribution (OOD) generalization. A popular example is the recent LP+FT protocol, which first learns a linear probe and then uses that initialization during FT. However, in this paper, we find that when adaptation protocols are also evaluated on a variety of safety objectives (e.g., calibration, robustness, etc.), a complementary perspective to feature distortion is required to explain protocol behavior. To this end, we study the susceptibility of protocols to simplicity bias (SB), i.e., the well-known propensity of neural networks to rely upon simple features, as SB has recently been shown to underlie several problems in robust generalization. Using a synthetic dominoes dataset obtained by pairing (complex) CIFAR10 with (simple) MNIST samples, we demonstrate the susceptibility of existing protocols to SB. Given the strong effectiveness of LP+FT, we propose incorporating hardness-promoting perturbations during LP to obtain initializations for FT that further decrease SB. We verify the effectiveness of these modified LP+FT protocols by decreasing SB on the dominoes dataset, and jointly improving OOD generalization and safety on standard adaptation benchmarks.
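The two-stage structure of LP+FT described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the one-layer tanh feature extractor, the squared loss, the synthetic regression data, and all names (`features`, `train`, `W_pre`, `v0`) are assumptions chosen so that the example runs with NumPy alone. It only shows the ordering of the two stages: a linear probe is fit on frozen features, and the resulting head is then used to initialize full fine-tuning.

```python
import numpy as np

def features(W, X):
    """Toy stand-in for a pretrained feature extractor: one tanh layer."""
    return np.tanh(X @ W.T)

def mse(W, v, X, y):
    """Half mean squared error of the model head(features(X))."""
    return 0.5 * np.mean((features(W, X) @ v - y) ** 2)

def train(W, v, X, y, steps, lr, update_features):
    """Gradient descent on squared loss; W stays frozen unless update_features is True."""
    W, v = W.copy(), v.copy()
    n = len(y)
    for _ in range(steps):
        h = features(W, X)                  # (n, d_hidden)
        err = h @ v - y                     # (n,)
        grad_v = h.T @ err / n
        if update_features:
            # Backpropagate through tanh using the head from before this step.
            dz = np.outer(err, v) * (1.0 - h ** 2) / n
            W -= lr * (dz.T @ X)
        v -= lr * grad_v
    return W, v

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 5))                # hypothetical downstream inputs
y = np.sin(X[:, 0]) + 0.5 * X[:, 1]          # hypothetical downstream targets
W_pre = rng.normal(scale=0.5, size=(8, 5))   # stands in for pretrained weights
v0 = np.zeros(8)                             # untrained head

# Stage 1 (LP): fit only the head on frozen features.
W_lp, v_lp = train(W_pre, v0, X, y, steps=200, lr=0.1, update_features=False)

# Stage 2 (FT): update all parameters, starting from the probed head.
W_ft, v_ft = train(W_lp, v_lp, X, y, steps=200, lr=0.05, update_features=True)

print("loss before LP:", mse(W_pre, v0, X, y))
print("loss after LP: ", mse(W_lp, v_lp, X, y))
print("loss after FT: ", mse(W_ft, v_ft, X, y))
```

In this sketch the features are untouched during the first stage and move only during the second, which is the property that LP+FT relies on to limit feature distortion. The hardness-promoting perturbations proposed in the abstract would modify the first stage and are not modeled here.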
Supplementary Material: zip
Please Choose The Closest Area That Your Submission Falls Into: Deep Learning and representational learning