Towards Better Understanding of Domain Shift on Linear-Probed Visual Foundation Models

Published: 27 Oct 2023, Last Modified: 24 Apr 2024
Venue: ICBINB 2023
Keywords: Computer vision, foundation models, domain shift, generalization, domain transfer, empirical evaluation
TL;DR: An empirical evaluation of popular visual foundation models that showcases their strengths and weaknesses in generalizing and transferring under domain shift.
Abstract: Visual foundation models have recently emerged with a promise similar to that of their language counterparts: the ability to produce representations of visual data that can be used successfully across a variety of tasks and contexts. In the research literature, this is commonly demonstrated through “domain generalization” experiments on linear models trained on representations produced by foundation models (i.e., linear probes). These experiments are largely limited to a small number of benchmark data sets and report accuracy as the single figure of merit, giving little insight beyond these numbers into how different foundation models represent shifts. In this work we perform an empirical evaluation that expands the scope of previously reported results in order to give a better understanding of how domain shifts are modeled. Namely, we investigate not only how models generalize across domains, but also how they may enable domain transfer. Our evaluation spans a number of recent visual foundation models and benchmarks. We find that not only do linear probes fail to generalize on some shift benchmarks, but linear probes trained on some shifted data also achieve low train accuracy, indicating that accurate transfer of linear probes is not possible with some visual foundation models.
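To make the distinction between generalization and transfer concrete, the following is a minimal sketch and not the authors' code. It assumes frozen features are already available and replaces them with synthetic 2-D Gaussian data; the function names (`fit_linear_probe`, `make_features`, `run_demo`), the "rotated" and "collapsed" target domains, and all hyperparameters are hypothetical choices made for illustration. Generalization is measured by applying a probe trained on source features to target features; transfer is measured by the train accuracy of a probe fit directly on target features.

```python
import numpy as np


def fit_linear_probe(feats, labels, n_classes, lr=0.1, steps=300):
    """Fit a softmax linear classifier on frozen features by gradient descent."""
    n, d = feats.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    onehot = np.eye(n_classes)[labels]
    for _ in range(steps):
        logits = feats @ W + b
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs = probs / probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n  # gradient of mean cross-entropy w.r.t. logits
        W -= lr * (feats.T @ grad)
        b -= lr * grad.sum(axis=0)
    return W, b


def accuracy(probe, feats, labels):
    """Fraction of samples whose argmax logit matches the label."""
    W, b = probe
    return float(np.mean((feats @ W + b).argmax(axis=1) == labels))


def make_features(rng, n, class_offset):
    """Hypothetical stand-in for foundation-model embeddings (assumption).

    Two classes are placed at -class_offset and +class_offset with unit noise.
    """
    labels = rng.integers(0, 2, size=n)
    sign = 2.0 * labels - 1.0
    feats = rng.normal(size=(n, 2)) + sign[:, None] * np.asarray(class_offset)
    return feats, labels


def run_demo(seed=0, n=1000):
    rng = np.random.default_rng(seed)
    src = make_features(rng, n, (2.0, 0.0))        # source domain
    rotated = make_features(rng, n, (0.0, 2.0))    # shift moves the class axis
    collapsed = make_features(rng, n, (0.0, 0.0))  # shift removes class signal

    src_probe = fit_linear_probe(*src, n_classes=2)
    return {
        "source_train": accuracy(src_probe, *src),
        # Generalization: source-trained probe evaluated on shifted features.
        "generalize_rotated": accuracy(src_probe, *rotated),
        # Transfer: probe trained on shifted features, scored on its own train set.
        "transfer_rotated": accuracy(fit_linear_probe(*rotated, n_classes=2), *rotated),
        "transfer_collapsed": accuracy(fit_linear_probe(*collapsed, n_classes=2), *collapsed),
    }


if __name__ == "__main__":
    for name, value in run_demo().items():
        print(f"{name}: {value:.3f}")
```

In this toy setup the rotated domain is a case where generalization fails but transfer succeeds, because the classes remain linearly separable in the shifted features. The collapsed domain mimics the situation the abstract highlights: even a probe trained on the shifted data reaches low train accuracy, so no linear probe can transfer.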
Submission Number: 25