What Variables Affect Out-of-Distribution Generalization in Pretrained Models?

Published: 25 Sept 2024 · Last Modified: 06 Nov 2024 · NeurIPS 2024 poster · License: CC BY 4.0
Keywords: Image Embeddings, Out-of-Distribution Generalization, Tunnel Effect, Neural Collapse
TL;DR: We identify which variables matter most for the out-of-distribution generalization of embeddings, and we show that the tunnel effect hypothesis proposed at NeurIPS 2023 is not universal.
Abstract: Embeddings produced by pre-trained deep neural networks (DNNs) are widely used; however, their efficacy for downstream tasks can vary widely. We study the factors influencing transferability and out-of-distribution (OOD) generalization of pre-trained DNN embeddings through the lens of the tunnel effect hypothesis, which is closely related to intermediate neural collapse. This hypothesis suggests that deeper DNN layers compress representations and hinder OOD generalization. Contrary to earlier work, our experiments show this is not a universal phenomenon. We comprehensively investigate the impact of DNN architecture, training data, image resolution, and augmentations on transferability. We identify that training with high-resolution datasets containing many classes greatly reduces representation compression and improves transferability. Our results emphasize the danger of generalizing findings from toy datasets to broader contexts.
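The tunnel effect is typically measured by attaching linear probes to each layer of a pretrained network and comparing in-distribution (ID) and OOD probe accuracy: layers where OOD accuracy falls while ID accuracy holds are the "tunnel." The sketch below illustrates that general protocol under stated assumptions (a torchvision ResNet, scikit-learn logistic-regression probes, and hypothetical helper names `extract_features` / `probe_accuracy`); it is not the paper's exact evaluation code.

```python
# Minimal sketch of layer-wise linear probing for the tunnel effect.
# Assumptions (not from the paper): a torchvision ResNet backbone,
# scikit-learn probes, and user-supplied ID/OOD data loaders.
import torch
import torchvision.models as models
from sklearn.linear_model import LogisticRegression

def extract_features(model, layer_names, loader, device="cpu"):
    """Collect flattened activations from the named layers over a loader."""
    feats = {name: [] for name in layer_names}
    labels, hooks = [], []
    modules = dict(model.named_modules())
    for name in layer_names:
        # Default-arg trick binds `name` at hook-creation time.
        hooks.append(modules[name].register_forward_hook(
            lambda m, inp, out, name=name:
                feats[name].append(out.flatten(start_dim=1).cpu())))
    model.eval().to(device)
    with torch.no_grad():
        for x, y in loader:
            model(x.to(device))
            labels.append(y)
    for h in hooks:
        h.remove()
    return ({n: torch.cat(v).numpy() for n, v in feats.items()},
            torch.cat(labels).numpy())

def probe_accuracy(train_X, train_y, test_X, test_y):
    """Fit a linear probe on one layer's features and score held-out data."""
    clf = LogisticRegression(max_iter=1000).fit(train_X, train_y)
    return clf.score(test_X, test_y)

# Hypothetical usage: a falling OOD curve across layer1..layer4 while the
# ID curve holds steady would indicate tunnel-style representation compression.
# model = models.resnet18(weights="IMAGENET1K_V1")
# layers = ["layer1", "layer2", "layer3", "layer4"]
# tr_f, tr_y = extract_features(model, layers, ood_train_loader)
# te_f, te_y = extract_features(model, layers, ood_test_loader)
# ood_curve = [probe_accuracy(tr_f[n], tr_y, te_f[n], te_y) for n in layers]
```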
Supplementary Material: zip
Primary Area: Evaluation (methodology, meta studies, replicability and validity)
Submission Number: 594