Project with Source, Probe with Target: Extracting Useful Features for Adaptation to Distribution Shifts
Keywords: distribution-shift robustness, fine-tuning, adaptation
TL;DR: We propose Project and Probe, a lightweight, sample-efficient approach that learns a diverse set of predictive features and adapts to a target distribution by interpolating among them with a small target dataset.
Abstract: Conventional approaches to robustness try to learn a model based on causal features. However, identifying maximally robust or causal features may be difficult in some scenarios, and in others, non-causal "shortcut" features may actually be more predictive. We propose a lightweight, sample-efficient approach that learns a diverse set of features and adapts to a target distribution by interpolating these features with a small target dataset. Our approach, Project and Probe (Pro^2), first learns a linear projection that maps a pre-trained embedding onto orthogonal directions while being predictive of labels in the source dataset. The goal of this step is to learn a variety of predictive features, so that at least some of them remain useful after distribution shift. Pro^2 then learns a linear classifier on top of these projected features using a small target dataset. We theoretically show that Pro^2 learns a projection matrix that is optimal for classification in an information-theoretic sense, resulting in better generalization due to a favorable bias-variance tradeoff. Our experiments on eight distribution shift settings show that Pro^2 improves performance by 5-15% when given limited target data compared to prior methods such as standard linear probing.
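The two steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the orthogonal directions are optimized, so the sketch assumes a simple sequential scheme (fit a logistic-regression direction on source data, remove it from the embeddings, repeat). The function names `project`, `probe`, `predict`, and `make_data`, the synthetic 5-dimensional "embedding", and all hyperparameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    # Clip logits so np.exp cannot overflow.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def fit_logreg(X, y, steps=1000, lr=0.1):
    """Binary logistic regression by full-batch gradient descent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        err = sigmoid(X @ w + b) - y  # gradient of the log loss w.r.t. the logits
        w -= lr * (X.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b


def project(X_src, y_src, d):
    """Step 1 (assumed scheme): d orthonormal directions, each fit on source labels."""
    directions, residual = [], X_src.copy()
    for _ in range(d):
        w, _ = fit_logreg(residual, y_src)
        for u in directions:  # Gram-Schmidt, for numerical safety
            w = w - (w @ u) * u
        u = w / np.linalg.norm(w)
        directions.append(u)
        # Remove the direction just found so the next fit must use a different one.
        residual = residual - np.outer(residual @ u, u)
    return np.stack(directions, axis=1)  # shape (D, d)


def probe(Pi, X_tgt, y_tgt):
    """Step 2: linear classifier on the projected features, fit on a small target set."""
    return fit_logreg(X_tgt @ Pi, y_tgt)


def predict(Pi, w, b, X):
    return (sigmoid((X @ Pi) @ w + b) > 0.5).astype(int)


def make_data(rng, n, agree):
    """Synthetic stand-in for pre-trained embeddings (purely illustrative).

    Dimension 0 is a "shortcut" that matches the label with probability `agree`;
    dimension 1 is a weaker feature whose relation to the label never changes.
    """
    y = rng.integers(0, 2, size=n)
    s = 2.0 * y - 1.0
    X = 0.5 * rng.normal(size=(n, 5))
    sign = np.where(rng.random(n) < agree, 1.0, -1.0)
    X[:, 0] += 2.0 * s * sign
    X[:, 1] += s
    return X, y.astype(float)


rng = np.random.default_rng(0)
X_src, y_src = make_data(rng, 2000, agree=0.95)  # shortcut is reliable in the source
X_tgt, y_tgt = make_data(rng, 32, agree=0.5)     # shortcut is uninformative in the target
X_test, y_test = make_data(rng, 1000, agree=0.5)

Pi = project(X_src, y_src, d=2)
w, b = probe(Pi, X_tgt, y_tgt)
print("target accuracy:", (predict(Pi, w, b, X_test) == y_test).mean())
```

In this toy setup the probe only has to fit `d` weights plus a bias on the target data, which is the sample-efficiency argument the abstract makes: the source data narrows the hypothesis space to a few predictive directions, and the small target set decides how to weight them.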