Keywords: Test-Time Adaptation; Vision-Language Models; Semantic Regularization; Distribution Shift
TL;DR: We propose a test-time adaptation method for CLIP that models per-class semantic drift and cross-class structural relations from pseudo-labeled samples, achieving lightweight, unsupervised vision-language adaptation without access to training data.
Abstract: Test-time adaptation (TTA) aims to improve model robustness under distribution shift by exploiting unlabeled test data. Existing methods often rely on pseudo-labels, which are noisy and typically treated independently, ignoring both their reliability over time and the semantic structure of the label space. We introduce SURE (Semantic Uncertainty REgularization), a framework that regularizes predictions through a dynamically evolving prototype-reliability graph (PRG). The PRG captures semantic affinity across classes and the stability of prediction confidence over time, enabling the selective propagation of reliable predictions while suppressing erroneous ones. This structure-driven regularization enforces semantic consistency and prevents error amplification. Across diverse domain-shift benchmarks, SURE consistently outperforms prior methods, offering a principled and generalizable approach to reliable TTA.
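The abstract's core mechanism (class prototypes, a cross-class affinity graph, and per-class confidence stability tracked over time) can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the class name `PrototypeReliabilityGraph`, the EMA-based stability score, the 0.5 reliability threshold, and the affinity-weighted smoothing step are all assumptions chosen to make the idea concrete.

```python
import numpy as np

def cosine_affinity(prototypes):
    """Pairwise cosine similarity between class prototypes (semantic affinity)."""
    P = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    return P @ P.T

class PrototypeReliabilityGraph:
    """Hypothetical sketch of a PRG: per-class prototypes plus a per-class
    reliability score, updated online from pseudo-labeled test samples."""

    def __init__(self, prototypes, momentum=0.9):
        self.prototypes = prototypes.astype(float)   # (C, d) class prototypes
        self.momentum = momentum
        self.reliability = np.full(len(prototypes), 0.5)  # per-class stability

    def update(self, feature, probs):
        """Absorb one test sample: track confidence stability over time via an
        exponential moving average, and only move the winning prototype toward
        the feature when that class has been reliably predicted."""
        c = int(np.argmax(probs))                    # pseudo-label
        self.reliability[c] = (self.momentum * self.reliability[c]
                               + (1 - self.momentum) * probs[c])
        if self.reliability[c] > 0.5:                # suppress unstable classes
            self.prototypes[c] = (self.momentum * self.prototypes[c]
                                  + (1 - self.momentum) * feature)

    def regularize(self, probs):
        """Propagate probability mass along semantic affinity, down-weighting
        edges into unreliable classes, then renormalize."""
        A = cosine_affinity(self.prototypes) * self.reliability[None, :]
        smoothed = np.clip(A @ probs, 1e-8, None)
        return smoothed / smoothed.sum()
```

In a CLIP setting, the prototypes would be initialized from text-encoder embeddings of the class prompts and `feature` would be an image embedding; here random vectors stand in for both.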
Supplementary Material: zip
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Submission Number: 7103