Keywords: robust fine-tuning, foundation models, adaptation, few-shot learning, meta-learning, hyperparameter optimization
TL;DR: Robust fine-tuning via hyperparameter optimization on OOD validation data
Abstract: Foundation models encode a rich representation that can be adapted to a desired task by fine-tuning on task-specific data.
However, fine-tuning a model on one particular data distribution often compromises the model's original performance on other distributions.
Current methods for robust fine-tuning use various hand-crafted regularization techniques to keep the fine-tuned model close to the base foundation model.
Yet it is hard to specify directly which characteristics of the foundation model to retain during fine-tuning, as this depends on the complex interplay between the pre-training, fine-tuning, and evaluation distributions.
We propose AutoFT, a data-driven method for guiding foundation model adaptation: optimizing hyperparameters for fine-tuning with respect to post-adaptation performance on a small out-of-distribution (OOD) validation set.
We find that when optimizing hyperparameters for OOD generalization, it is especially beneficial to use a highly expressive hyperparameter space such as per-layer learning rates and loss weight coefficients.
Our evaluation demonstrates state-of-the-art performance on OOD distributions unseen during fine-tuning and hyperparameter optimization.
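As a rough illustration of the idea described in the abstract, the following toy sketch tunes per-parameter learning rates and loss weight coefficients by scoring each fine-tuned model on a small OOD validation set. Everything here is an assumption made for illustration, not the authors' implementation: the two-parameter linear "foundation model", the synthetic data, the search ranges, the helper names (`fine_tune`, `sample_config`, `search`), and the use of plain random search as the hyperparameter optimizer.

```python
import random

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def mse(w, data):
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def fine_tune(w_pre, train, lrs, loss_weights, steps=50):
    """Inner loop: gradient descent on a weighted sum of the task loss (MSE)
    and the squared distance to the pre-trained weights."""
    w = list(w_pre)
    a_task, a_reg = loss_weights
    n = len(train)
    for _ in range(steps):
        grads = [0.0] * len(w)
        for x, y in train:
            err = predict(w, x) - y
            for i, xi in enumerate(x):
                grads[i] += a_task * 2.0 * err * xi / n
        for i in range(len(w)):
            grads[i] += a_reg * 2.0 * (w[i] - w_pre[i])
        # One learning rate per parameter (a stand-in for per-layer rates).
        w = [wi - lr * g for wi, lr, g in zip(w, lrs, grads)]
    return w

def sample_config(rng, n_params):
    """Hypothetical search space: log-uniform learning rates and loss weights."""
    return {
        "lrs": [10 ** rng.uniform(-3, -1) for _ in range(n_params)],
        "loss_weights": (10 ** rng.uniform(-2, 0), 10 ** rng.uniform(-2, 0)),
    }

def search(w_pre, train, ood_val, n_trials=30, seed=0):
    """Outer loop: keep the configuration whose fine-tuned model has the
    lowest error on the OOD validation set."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        cfg = sample_config(rng, len(w_pre))
        w = fine_tune(w_pre, train, cfg["lrs"], cfg["loss_weights"])
        score = mse(w, ood_val)
        if best is None or score < best["ood_val_mse"]:
            best = {"config": cfg, "weights": w, "ood_val_mse": score}
    return best

# Synthetic data (assumed): inputs are (bias, x1); the fine-tuning set covers
# x1 in [0, 1], while the OOD validation set covers x1 in [2, 3].
W_PRE = [0.0, 1.5]
TRAIN = [((1.0, k / 10), 2.0 * (k / 10) + 0.5) for k in range(11)]
OOD_VAL = [((1.0, 2.0 + k / 5), 2.0 * (2.0 + k / 5) + 0.5) for k in range(6)]

if __name__ == "__main__":
    best = search(W_PRE, TRAIN, OOD_VAL)
    print(best["config"], best["ood_val_mse"])
```

The point of the sketch is the selection criterion: configurations are ranked by post-adaptation error on held-out OOD data rather than by in-distribution validation error. A practical setup would replace the toy model with a real network and could swap random search for any other hyperparameter optimizer.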
Submission Number: 104