AutoFT: Robust Fine-Tuning by Optimizing Hyperparameters on OOD Data

Published: 28 Oct 2023, Last Modified: 02 Apr 2024
Venue: DistShift 2023 (Poster)
Keywords: robust fine-tuning, foundation models, adaptation, few-shot learning, meta-learning, hyperparameter optimization
TL;DR: Robust fine-tuning via hyperparameter optimization on OOD validation data
Abstract: Foundation models encode a rich representation that can be adapted to a desired task by fine-tuning on task-specific data. However, fine-tuning a model on one particular data distribution often compromises the model's original performance on other distributions. Current methods for robust fine-tuning utilize various hand-crafted regularization techniques to constrain the fine-tuning process towards the base foundation model. Yet, it is hard to directly specify what characteristics of the foundation model to retain during fine-tuning, as this is influenced by the complex interplay between the pre-training, fine-tuning, and evaluation distributions. We propose AutoFT, a data-driven method for guiding foundation model adaptation: optimizing hyperparameters for fine-tuning with respect to post-adaptation performance on a small out-of-distribution (OOD) validation set. We find that when optimizing hyperparameters for OOD generalization, it is especially beneficial to use a highly expressive hyperparameter space such as per-layer learning rates and loss weight coefficients. Our evaluation demonstrates state-of-the-art performance on OOD distributions unseen during fine-tuning and hyperparameter optimization.
Submission Number: 104
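
The idea in the abstract, choosing fine-tuning hyperparameters by how well the fine-tuned model scores on a small OOD validation set, can be illustrated with a toy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the two-parameter linear model, the synthetic data with a spurious feature, the names `fine_tune`, `search`, and `make_data`, the search ranges, and the use of plain random search as a stand-in for whatever hyperparameter optimizer the authors use. Each weight plays the role of a "layer" with its own learning rate, and the loss weights `w_task` and `w_reg` scale the task loss and a penalty toward the pre-trained weights.

```python
import math
import random


def make_data(rng, n, spurious):
    """Toy data: y depends only on x0. In the fine-tuning distribution x1 is a
    spurious copy of y; in the OOD distribution x1 is independent noise."""
    data = []
    for _ in range(n):
        x0 = rng.uniform(-1, 1)
        y = 1.2 * x0
        x1 = y + rng.gauss(0, 0.05) if spurious else rng.uniform(-1, 1)
        data.append((x0, x1, y))
    return data


def mse(w, data):
    """Mean squared error of the linear model y_hat = w[0]*x0 + w[1]*x1."""
    return sum((w[0] * x0 + w[1] * x1 - y) ** 2 for x0, x1, y in data) / len(data)


def fine_tune(w_init, train, hp, steps=50):
    """Inner loop: gradient descent with one learning rate per parameter group
    and a weighted L2 penalty pulling the weights back toward w_init."""
    w = list(w_init)
    n = len(train)
    for _ in range(steps):
        grad = [0.0, 0.0]
        for x0, x1, y in train:
            err = w[0] * x0 + w[1] * x1 - y
            grad[0] += 2 * err * x0 / n
            grad[1] += 2 * err * x1 / n
        for i in range(2):
            g = hp["w_task"] * grad[i] + hp["w_reg"] * 2 * (w[i] - w_init[i])
            w[i] -= hp["lr"][i] * g
    return w


def log_uniform(rng, lo, hi):
    """Sample from a log-uniform distribution on [lo, hi]."""
    return math.exp(rng.uniform(math.log(lo), math.log(hi)))


DEFAULT_HP = {"lr": [0.1, 0.1], "w_task": 1.0, "w_reg": 0.0}


def search(w_init, train, ood_val, trials=30, seed=0):
    """Outer loop: random search (an illustrative stand-in optimizer) that
    scores each hyperparameter setting by post-fine-tuning loss on OOD data."""
    rng = random.Random(seed)
    candidates = [DEFAULT_HP]  # include the hand-picked default as a baseline
    for _ in range(trials):
        candidates.append({
            "lr": [log_uniform(rng, 1e-3, 0.3), log_uniform(rng, 1e-3, 0.3)],
            "w_task": log_uniform(rng, 0.1, 1.0),
            "w_reg": log_uniform(rng, 1e-3, 1.0),
        })
    scored = [(mse(fine_tune(w_init, train, hp), ood_val), hp) for hp in candidates]
    return min(scored, key=lambda pair: pair[0])


data_rng = random.Random(0)
train = make_data(data_rng, 200, spurious=True)     # fine-tuning distribution
ood_val = make_data(data_rng, 20, spurious=False)   # small OOD validation set
w_pre = [1.0, 0.0]                                  # hypothetical pre-trained weights

best_loss, best_hp = search(w_pre, train, ood_val)
default_loss = mse(fine_tune(w_pre, train, DEFAULT_HP), ood_val)
print("default OOD loss:", default_loss)
print("searched OOD loss:", best_loss, "with", best_hp)
```

Because the default setting is one of the candidates, the selected setting can never score worse than it on the OOD validation set. Whether that gain carries over to distributions not seen during the search is the empirical question the paper evaluates, and this toy does not address it.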