Realistic Evaluation of Test-Time Adaptation: Surrogate-Based Model Selection Strategies

15 Sept 2023 (modified: 25 Mar 2024) · ICLR 2024 Conference Withdrawn Submission
Keywords: test-time adaptation, model selection
Abstract: Test-Time Adaptation (TTA) has recently emerged as a promising strategy for tackling the problem of machine learning model robustness under distribution shifts. This setting poses a significant challenge, as the model has to adapt to the new environment without any labeled data. Contemporary methods, such as neural networks, typically rely on a cumbersome hyperparameter tuning procedure that leverages target labels; but what happens when those labels are unavailable, as in the test-time adaptation scenario? In this work, we tackle this very problem of hyperparameter selection by evaluating several surrogate metrics that require no access to the test labels. The main goal of this work is to provide a realistic evaluation of TTA methods under different domain shifts, as well as an evaluation of different model selection strategies in TTA. Our main findings are: i) the accuracy of model selection strategies varies strongly across datasets and adaptation methods; ii) out of the 6 evaluated approaches, only the AdaContrast method allows for surrogate-based model selection that matches oracle selection performance; and iii) using a tiny set of labeled test samples beats all competing selection strategies. Our findings underscore the need for future research in the field to conduct rigorous evaluations with explicitly stated model selection strategies, in order to give more realistic estimates of the performance of test-time adaptation methods.
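The abstract contrasts label-free surrogate metrics with selection based on a tiny labeled test sample. The sketch below illustrates one possible instance of such a strategy, using average prediction entropy as the surrogate; it assumes PyTorch and hypothetical data loaders (`candidate_models`, `test_loader`, `labeled_subset`) and is not the authors' implementation.

```python
import torch
import torch.nn.functional as F

# Illustrative sketch only: label-free surrogate-based hyperparameter selection for TTA.
# `candidate_models` maps a hyperparameter setting to a model already adapted with it;
# `test_loader` yields unlabeled test batches; `labeled_subset` yields (x, y) pairs
# from a tiny labeled sample of the test distribution. All names are hypothetical.

@torch.no_grad()
def mean_prediction_entropy(model, test_loader, device="cpu"):
    """Average Shannon entropy of the model's test predictions (lower assumed better)."""
    model.eval().to(device)
    total, count = 0.0, 0
    for x in test_loader:
        probs = F.softmax(model(x.to(device)), dim=1)
        total += -(probs * probs.clamp_min(1e-12).log()).sum(dim=1).sum().item()
        count += x.size(0)
    return total / max(count, 1)

@torch.no_grad()
def accuracy_on_subset(model, labeled_subset, device="cpu"):
    """Accuracy on a small labeled sample of the test distribution."""
    model.eval().to(device)
    correct, count = 0, 0
    for x, y in labeled_subset:
        correct += (model(x.to(device)).argmax(dim=1) == y.to(device)).sum().item()
        count += y.size(0)
    return correct / max(count, 1)

def select_by_surrogate(candidate_models, test_loader):
    # Pick the hyperparameter setting whose adapted model has the lowest surrogate value.
    return min(candidate_models,
               key=lambda hp: mean_prediction_entropy(candidate_models[hp], test_loader))

def select_by_labeled_subset(candidate_models, labeled_subset):
    # Pick the setting with the highest accuracy on the tiny labeled sample.
    return max(candidate_models,
               key=lambda hp: accuracy_on_subset(candidate_models[hp], labeled_subset))
```

In this toy setup, the oracle baseline mentioned in the abstract would correspond to ranking the candidate settings by their accuracy on the full labeled test set, against which both selection functions above could be compared.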
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 17