Parameter-Efficient Attention Transfer for Multi-Modal Test-Time Adaptation

18 Sept 2025 (modified: 14 Nov 2025) · ICLR 2026 Conference Withdrawn Submission · CC BY 4.0
Keywords: test-time adaptation, multi-modal learning
Abstract: Test-time adaptation (TTA) has proven effective in enhancing model robustness against unforeseen distribution shifts during testing. However, current TTA methods struggle when applied to multi-modal models. In this paper, we explore multi-modal TTA and reveal two key limitations of existing approaches: i) difficulty in mitigating attention shifts when dealing with biased modalities, and ii) insufficient exploitation of the synergy and complementarity among multiple modalities. To address these challenges, we propose a novel method called **P**arameter-**E**fficient **A**ttention **T**ransfer (**PEAT**), which strikes a balance between performance and efficiency. Specifically, we first analyze modulation strategies for updating different model parameters and propose adapting the self-attention modules. Furthermore, we design a modality-aware low-rank adaptation method to dynamically learn cross-domain attention patterns. Our approach introduces intra-modal and inter-modal interactions for LoRA modules: the former captures uni-modal domain information through modality-specific parameters, while the latter promotes cross-modal feature alignment in a unified space through modality-shared parameters. Extensive experiments conducted across various distribution-shifted modalities, including video, image, audio, and text, demonstrate that PEAT consistently outperforms existing state-of-the-art methods.
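To illustrate the abstract's description of intra-modal (modality-specific) and inter-modal (modality-shared) low-rank adaptation, here is a minimal NumPy sketch. All names, shapes, and the zero-initialization choice are illustrative assumptions, not the paper's actual implementation; it only shows how a frozen attention projection could be augmented with a shared LoRA update plus a per-modality LoRA update.

```python
import numpy as np

rng = np.random.default_rng(0)

D, R = 16, 4                      # feature dim and LoRA rank (illustrative)
MODALITIES = ["video", "audio"]   # hypothetical modality set

# Frozen pre-trained attention projection weight.
W = rng.standard_normal((D, D)) * 0.02

# Modality-shared low-rank factors (inter-modal interaction).
A_shared = rng.standard_normal((R, D)) * 0.01
B_shared = np.zeros((D, R))       # zero-init: adaptation starts as a no-op

# Modality-specific low-rank factors (intra-modal interaction).
A_spec = {m: rng.standard_normal((R, D)) * 0.01 for m in MODALITIES}
B_spec = {m: np.zeros((D, R)) for m in MODALITIES}

def adapted_projection(x, modality):
    """Frozen weight plus shared and modality-specific low-rank updates."""
    delta = B_shared @ A_shared + B_spec[modality] @ A_spec[modality]
    return x @ (W + delta).T

x = rng.standard_normal((2, D))           # two tokens from one modality stream
out = adapted_projection(x, "video")
# With zero-initialized B factors, the adapted output equals the frozen one.
assert np.allclose(out, x @ W.T)
```

At test time, only the low-rank factors would be updated, which is what makes such a scheme parameter-efficient relative to tuning the full attention weights.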
Primary Area: transfer learning, meta learning, and lifelong learning
Submission Number: 10580