Label Distribution Shift-Aware Prediction Refinement for Test-Time Adaptation

Minguk Jang; Hye Won Chung

Published: 07 Feb 2025. Last Modified: 07 Feb 2025. Accepted by TMLR. CC BY 4.0

Abstract: Test-time adaptation (TTA) is an effective approach to mitigate performance degradation of trained models when encountering input distribution shifts at test time. However, existing TTA methods often suffer significant performance drops when facing additional class distribution shifts. We first analyze TTA methods under label distribution shifts and identify the presence of class-wise confusion patterns commonly observed across different covariate shifts. Based on this observation, we introduce label Distribution shift-Aware prediction Refinement for Test-time adaptation (DART), a novel TTA method that refines the predictions by focusing on class-wise confusion patterns. DART trains a prediction refinement module during an intermediate time by exposing it to several batches with diverse class distributions using the training dataset. This module is then used during test time to detect and correct class distribution shifts, significantly improving pseudo-label accuracy for test data. Our method exhibits 5-18% gains in accuracy under label distribution shifts on CIFAR-10C, without any performance degradation when there is no label distribution shift. Extensive experiments on CIFAR, PACS, OfficeHome, and ImageNet benchmarks demonstrate DART's ability to correct inaccurate predictions caused by test-time distribution shifts. This improvement leads to enhanced performance in existing TTA methods, making DART a valuable plug-in tool.
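The abstract says the refinement module is exposed to batches with diverse class distributions drawn from the training dataset. As a rough illustration of what such sampling could look like, the sketch below draws a per-batch class prior from a Dirichlet distribution and then samples training indices to match it. The Dirichlet choice, the function name `sample_label_shifted_batch`, and the `alpha` parameter are assumptions made for this example; the abstract does not specify how the batches are constructed.

```python
import numpy as np

def sample_label_shifted_batch(labels, batch_size, alpha, rng):
    """Return indices of a batch whose class proportions follow a Dirichlet draw.

    Hypothetical helper: the Dirichlet prior is an assumption, not a detail
    taken from the abstract.
    """
    classes = np.unique(labels)
    # Class prior for this batch; a smaller alpha gives a more imbalanced batch.
    prior = rng.dirichlet(alpha * np.ones(len(classes)))
    # Number of samples to take from each class.
    counts = rng.multinomial(batch_size, prior)
    idx = []
    for c, n in zip(classes, counts):
        pool = np.flatnonzero(labels == c)
        # Sample with replacement so a small class pool cannot run out.
        idx.append(rng.choice(pool, size=n, replace=True))
    return np.concatenate(idx), prior

rng = np.random.default_rng(0)
labels = rng.integers(0, 10, size=5000)  # stand-in for training labels
batch_idx, prior = sample_label_shifted_batch(labels, 64, alpha=0.5, rng=rng)
```

Repeating the call with different random draws yields batches ranging from nearly balanced to heavily skewed, which is the kind of variety the abstract describes the module being trained on.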

Submission Length: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission: We uploaded our revised paper with the following modification: (Section 6.1) We added a discussion of DART's performance on large-scale benchmarks and of its use of training data during the intermediate time.

Supplementary Material: zip

Assigned Action Editor: ~Sungwoong_Kim2

Submission Number: 3540
