Test-Time Adaptation of Vision-Language Models with Low-Rank Pseudo-Consistency

TMLR Paper7335 Authors

04 Feb 2026 (modified: 09 Feb 2026) · Under review for TMLR · CC BY 4.0
Abstract: While test-time adaptation (TTA) methods enable vision-language models (VLMs) to adapt under distribution shifts, they typically rely on simple feature transformations on top of frozen encoders while learning from potentially noisy pseudo-labels. This approach may limit adaptation under significant domain shifts. In this paper, we propose PseudoAdapter, a novel TTA framework for VLMs that introduces low-rank adapters into the early layers of the encoder to enable domain-specific feature adaptation while maintaining generalization. To ensure effective learning from noisy and low-confidence predictions, PseudoAdapter combines confidence-calibrated pseudo-labeling with unsupervised consistency learning across augmented views. We further extend our approach with PseudoAdapter+, which integrates selective teacher supervision to improve adaptation with minimal overhead. Extensive evaluations on four out-of-distribution and ten cross-domain benchmarks demonstrate that our method outperforms prior state-of-the-art TTA approaches by an average of 6.84% and 3.25%, respectively. Ablation studies confirm the effectiveness of each proposed component.
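To make the abstract's ingredients concrete, the following is a minimal, hypothetical PyTorch sketch of the two ideas it names: a LoRA-style low-rank adapter that could be inserted into early encoder layers, and a test-time loss combining confidence-gated pseudo-labels with consistency across augmented views. All names (`LowRankAdapter`, `tta_loss`), the 0.9 confidence threshold, and the symmetric-KL consistency term are illustrative assumptions, not the authors' implementation, which is specified in the paper itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LowRankAdapter(nn.Module):
    """LoRA-style residual adapter: x + scale * B(A(x)), with B zero-initialized
    so the adapted layer starts out identical to the frozen one."""

    def __init__(self, dim: int, rank: int = 4, alpha: float = 1.0):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)  # A: dim -> rank
        self.up = nn.Linear(rank, dim, bias=False)    # B: rank -> dim
        nn.init.zeros_(self.up.weight)                # no change at initialization
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.scale * self.up(self.down(x))


def tta_loss(logits_weak: torch.Tensor,
             logits_strong: torch.Tensor,
             conf_threshold: float = 0.9) -> torch.Tensor:
    """Confidence-gated pseudo-label loss plus view-consistency regularization."""
    probs_weak = logits_weak.softmax(dim=-1)
    conf, pseudo = probs_weak.max(dim=-1)
    mask = (conf >= conf_threshold).float()

    # Cross-entropy on the strongly augmented view against confident
    # pseudo-labels taken from the weakly augmented view.
    ce = F.cross_entropy(logits_strong, pseudo, reduction="none")
    pseudo_loss = (mask * ce).sum() / mask.sum().clamp(min=1.0)

    # Symmetric KL consistency between the two views, applied to all samples.
    log_p = logits_weak.log_softmax(dim=-1)
    log_q = logits_strong.log_softmax(dim=-1)
    consistency = 0.5 * (
        F.kl_div(log_q, log_p, log_target=True, reduction="batchmean")
        + F.kl_div(log_p, log_q, log_target=True, reduction="batchmean")
    )
    return pseudo_loss + consistency
```

In this sketch only the adapter parameters would receive gradients at test time; the pretrained encoder weights stay frozen, matching the abstract's claim of domain-specific adaptation with preserved generalization.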
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Nicolas_THOME2
Submission Number: 7335