Test-Time Adaptation of Vision-Language Models with Low-Rank Pseudo-Consistency

TMLR Paper7335 Authors

04 Feb 2026 (modified: 09 Feb 2026) · Under review for TMLR · CC BY 4.0
Abstract: While test-time adaptation (TTA) methods enable vision-language models (VLMs) to adapt under distribution shifts, they typically rely on simple feature transformations on top of frozen encoders while learning from potentially noisy pseudo-labels. This approach may limit adaptation under significant domain shifts. In this paper, we propose PseudoAdapter, a novel TTA framework for VLMs that introduces low-rank adapters into the early layers of the encoder to enable domain-specific feature adaptation while maintaining generalization. To ensure effective learning from noisy and low-confidence predictions, PseudoAdapter combines confidence-calibrated pseudo-labeling with unsupervised consistency learning across augmented views. We further extend our approach with PseudoAdapter+, which integrates selective teacher supervision to improve adaptation with minimal overhead. Extensive evaluations on four out-of-distribution and ten cross-domain benchmarks demonstrate that our method outperforms prior state-of-the-art TTA approaches by an average of 6.84% and 3.25%, respectively. Ablation studies confirm the effectiveness of each proposed component.
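To make the abstract's ingredients concrete, the following is a minimal, hypothetical PyTorch sketch of the two ideas it names: a LoRA-style low-rank adapter that could be inserted into early encoder layers, and a test-time loss combining confidence-gated pseudo-labels with consistency across augmented views. All names (`LowRankAdapter`, `tta_loss`), the 0.9 confidence threshold, and the symmetric-KL consistency term are illustrative assumptions, not the authors' implementation, which is specified in the paper itself.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LowRankAdapter(nn.Module):
    """LoRA-style residual adapter: x + scale * B(A(x)), with B zero-initialized
    so the adapted layer starts out identical to the frozen one."""

    def __init__(self, dim: int, rank: int = 4, alpha: float = 1.0):
        super().__init__()
        self.down = nn.Linear(dim, rank, bias=False)  # A: dim -> rank
        self.up = nn.Linear(rank, dim, bias=False)    # B: rank -> dim
        nn.init.zeros_(self.up.weight)                # no change at initialization
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.scale * self.up(self.down(x))


def tta_loss(logits_weak: torch.Tensor,
             logits_strong: torch.Tensor,
             conf_threshold: float = 0.9) -> torch.Tensor:
    """Confidence-gated pseudo-label loss plus view-consistency regularization."""
    probs_weak = logits_weak.softmax(dim=-1)
    conf, pseudo = probs_weak.max(dim=-1)
    mask = (conf >= conf_threshold).float()

    # Cross-entropy on the strongly augmented view against confident
    # pseudo-labels taken from the weakly augmented view.
    ce = F.cross_entropy(logits_strong, pseudo, reduction="none")
    pseudo_loss = (mask * ce).sum() / mask.sum().clamp(min=1.0)

    # Symmetric KL consistency between the two views, applied to all samples.
    log_p = logits_weak.log_softmax(dim=-1)
    log_q = logits_strong.log_softmax(dim=-1)
    consistency = 0.5 * (
        F.kl_div(log_q, log_p, log_target=True, reduction="batchmean")
        + F.kl_div(log_p, log_q, log_target=True, reduction="batchmean")
    )
    return pseudo_loss + consistency
```

In this sketch only the adapter parameters would receive gradients at test time; the pretrained encoder weights stay frozen, matching the abstract's claim of domain-specific adaptation with preserved generalization.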
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Nicolas_THOME2
Submission Number: 7335