Test-Time Training for Out-of-Distribution Generalization

Yu Sun; Xiaolong Wang; Zhuang Liu; John Miller; Alexei A. Efros; Moritz Hardt

25 Sept 2019 (modified: 22 Jun 2025) | ICLR 2020 Conference Blind Submission | Readers: Everyone

TL;DR: Training on a single test input with self-supervision improves the prediction on that input when it is out-of-distribution.

Abstract: We introduce a general approach, called test-time training, for improving the performance of predictive models when test and training data come from different distributions. Test-time training turns a single unlabeled test instance into a self-supervised learning problem, on which we update the model parameters before making a prediction on the test sample. We show that this simple idea leads to surprising improvements on diverse image classification benchmarks aimed at evaluating robustness to distribution shifts. Theoretical investigations on a convex model reveal helpful intuitions for when we can expect our approach to help.

Code: https://drive.google.com/open?id=1xw-NylSnEjyHs67TXAptviOsx4YuSuZZ

Keywords: out-of-distribution, distribution shifts

Community Implementations: [3 code implementations](https://www.catalyzex.com/paper/test-time-training-for-out-of-distribution/code)
