Models Out of Line: A Fourier Lens on Distribution Shift Robustness

Published: 31 Oct 2022, Last Modified: 03 Jan 2023, NeurIPS 2022 Accept, Readers: Everyone
Keywords: OOD robustness, effective robustness, deep neural networks, spectral analysis, CLIP models
Abstract: Improving the accuracy of deep neural networks on out-of-distribution (OOD) data is critical to the acceptance of deep learning in real-world applications. It has been observed that accuracies on in-distribution (ID) versus OOD data follow a linear trend, and models that outperform this baseline are exceptionally rare (and referred to as "effectively robust"). Recently, some promising approaches have been developed to improve OOD robustness: model pruning, data augmentation, and ensembling or zero-shot evaluation of large pretrained models. However, there is still no clear understanding of the conditions on OOD data and model properties that are required to observe effective robustness. We approach this issue by conducting a comprehensive empirical study of diverse approaches that are known to impact OOD robustness on a broad range of natural and synthetic distribution shifts of CIFAR-10 and ImageNet. In particular, we view the "effective robustness puzzle" through a Fourier lens and ask how spectral properties of both models and OOD data correlate with OOD robustness. We find that this Fourier lens offers some insight into why certain robust models, particularly those from the CLIP family, achieve OOD robustness. However, our analysis also makes clear that no known metric is consistently the best explanation of OOD robustness. Thus, to aid future research into the OOD puzzle, we address the gap in publicly available models with effective robustness by introducing a set of pretrained CIFAR-10 models, RobustNets, with varying levels of OOD robustness.
TL;DR: We clarify the state of the OOD robustness puzzle, empirically finding that the surprising robustness of some models (e.g., CLIP) to distribution shifts is sometimes better explained by spectral metrics we introduce than by in-distribution accuracy.
Supplementary Material: pdf
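
The abstract describes "effective robustness" as outperforming the linear ID-versus-OOD accuracy trend. A minimal sketch of that idea follows, assuming the trend is a least-squares line fit on raw accuracies of a set of baseline models. The function name `effective_robustness` and the example numbers are hypothetical and not taken from the paper; related work often fits the trend on transformed accuracies (for example logit-scaled), which this sketch omits for simplicity.

```python
import numpy as np


def effective_robustness(id_acc, ood_acc, baseline_id, baseline_ood):
    """Return OOD accuracy minus the OOD accuracy predicted by a linear trend.

    The trend is a least-squares line fit on the baseline models only
    (an assumption of this sketch, not necessarily the paper's procedure).
    Positive values mean the model sits above the baseline trend.
    """
    # Fit ood = slope * id + intercept on the baseline models.
    slope, intercept = np.polyfit(baseline_id, baseline_ood, deg=1)
    predicted_ood = slope * np.asarray(id_acc, dtype=float) + intercept
    return np.asarray(ood_acc, dtype=float) - predicted_ood


# Hypothetical baseline models lying on the line ood = 0.8 * id - 0.1.
baseline_id = np.array([0.70, 0.80, 0.90])
baseline_ood = 0.8 * baseline_id - 0.1

# A hypothetical candidate model with ID accuracy 0.80 and OOD accuracy 0.60.
gap = effective_robustness([0.80], [0.60], baseline_id, baseline_ood)
print(gap)
```

Under this reading, a model on the trend has a gap near zero, and an "effectively robust" model has a clearly positive gap.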