Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: pretraining, domain adaptation, robustness
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Models trained on a labeled source domain (e.g., bright, nearby astronomical objects) often generalize poorly when deployed on an out-of-distribution (OOD) target domain (e.g., faint, distant objects). In the domain adaptation setting where unlabeled target data is available, self-supervised pretraining (e.g., masked autoencoding or contrastive learning) is a promising method to mitigate this performance drop. Pretraining reduces OOD error when the generic data augmentations used (e.g., masking or cropping) connect the source and target domains, which may be far apart in the input space. In this paper, we show on real-world tasks that standard fine-tuning after pretraining does not consistently reduce OOD error relative to supervised learning on labeled source data alone. To better leverage pretraining for distribution shifts, we propose Connect Later: after pretraining with generic augmentations to learn good representations within the source and target domains, fine-tune with targeted augmentations designed with knowledge of the distribution shift to better connect the domains. Connect Later improves average OOD performance over standard fine-tuning and supervised learning with targeted augmentations on 3 real-world datasets: astronomical time-series classification (AstroClassification) by 12%, redshift prediction for astronomical time-series (Redshifts) by 0.03 RMSE (11% relative), and wildlife species identification (iWildCam-WILDS) by 0.9%, achieving state-of-the-art performance on AstroClassification and, with ResNet-50, on iWildCam-WILDS.
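For concreteness, the following is a minimal PyTorch sketch of the two-phase recipe the abstract describes: self-supervised pretraining with generic augmentations on pooled unlabeled source and target data, followed by fine-tuning on labeled source data with targeted augmentations. The toy data, network sizes, and the `generic_augment` / `targeted_augment` functions are illustrative assumptions, not the paper's actual architectures or augmentations (which are task-specific, e.g., masked autoencoding for astronomical time series).

```python
# Minimal sketch of the Connect Later recipe (assumptions: toy MLP encoder,
# masked-reconstruction pretraining, and placeholder augmentations; the
# function names below are hypothetical, not the authors' actual API).
import torch
import torch.nn as nn
import torch.nn.functional as F

def generic_augment(x, mask_frac=0.5):
    # Generic augmentation: random masking, as in masked autoencoding.
    mask = (torch.rand_like(x) > mask_frac).float()  # 1 = kept, 0 = masked
    return x * mask, mask

def targeted_augment(x):
    # Placeholder targeted augmentation. In practice this is designed with
    # knowledge of the shift, e.g., degrading bright source objects so they
    # resemble faint target objects.
    return x + 0.1 * torch.randn_like(x)

encoder = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 16))
decoder = nn.Linear(16, 32)  # reconstruction head used only for pretraining
head = nn.Linear(16, 4)      # task head used only for fine-tuning (4 classes)

# Toy data: unlabeled inputs pooled from both domains, plus labeled source data.
unlabeled = torch.randn(256, 32)
source_x, source_y = torch.randn(128, 32), torch.randint(0, 4, (128,))

# Phase 1: self-supervised pretraining with generic augmentations
# (reconstruct the masked entries) on unlabeled source + target data.
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
for _ in range(100):
    x_masked, mask = generic_augment(unlabeled)
    recon = decoder(encoder(x_masked))
    loss = F.mse_loss(recon * (1 - mask), unlabeled * (1 - mask))
    opt.zero_grad()
    loss.backward()
    opt.step()

# Phase 2: fine-tune the pretrained encoder + task head on labeled source
# data, applying targeted augmentations to better connect the domains.
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(head.parameters()), lr=1e-4)
for _ in range(100):
    logits = head(encoder(targeted_augment(source_x)))
    loss = F.cross_entropy(logits, source_y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The key design choice the sketch illustrates is the separation of augmentation roles: generic augmentations appear only during pretraining to learn good within-domain representations, while shift-aware targeted augmentations appear only at fine-tuning time to connect the source and target domains.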
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
Supplementary Material: zip
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 6720