Perturbating, Tuning, and Collaborating: Harnessing Vision Foundation Models for Single Domain Generalization on Medical Imaging

Chuang LIU, Yichao Cao, Yingying Zhang, Xiu Su, Haogang Zhu

Published: 19 Nov 2024, Last Modified: 12 Nov 2025 | OpenReview Archive Direct Upload | Everyone | CC BY 4.0

Abstract: Single Domain Generalization (SDG) is critical in medical imaging applications. Recently, Vision Foundation Models (VFMs) have spearheaded a trend in AI development due to their robust generalizability and versatility. This work explores the generalization capabilities of VFMs alongside the domain-specific expertise of specialized models, investigating the boundaries of their respective capabilities so that the two can collaboratively address SDG challenges in medical imaging. We propose a framework for Collaborative reasoning between Specialized and Universal models for Single Domain Generalization (CollaSU-SDG) in medical imaging. Specifically, we first design a model-aware perturbation injection method from the perspective of single-source domain data, enabling differentiated and adaptive perturbation injection for models of two different scales. Then, a domain expansion adapter is designed for the VFM to adapt it to the augmented single-source domain medical data. Lastly, we introduce an adaptive hierarchical transfer and dynamic dense prompting method that facilitates collaborative reasoning between the specialized and universal models, eliminating the need for explicit prompts. Through these designs, CollaSU-SDG leverages the strengths of both specialized and universal models, achieving robust out-of-distribution generalization on single-source domain data. Experimental results demonstrate that CollaSU-SDG significantly advances the state-of-the-art performance across a wide range of medical datasets. All the code will be publicly available.
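
The abstract does not specify how the perturbation is computed, so the following is only a rough sketch of what "differentiated" injection for two model scales could look like. It assumes additive Gaussian intensity noise with a per-model standard deviation; the names `PERTURBATION_SCALE` and `perturb` and the scale values are hypothetical, and the adaptive component described in the abstract is not modeled.

```python
import random

# Hypothetical noise scales, one per model type; the abstract gives no values.
PERTURBATION_SCALE = {"specialized": 0.05, "universal": 0.20}


def perturb(image, model_kind, seed=0):
    """Return a noisy copy of `image` (rows of floats in [0, 1]).

    Assumption: additive Gaussian noise whose standard deviation depends
    on which model will consume the image. Values are clipped to [0, 1].
    """
    sigma = PERTURBATION_SCALE[model_kind]
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    return [
        [min(1.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


# A flat 4x4 "image" stands in for single-source domain data.
image = [[0.5] * 4 for _ in range(4)]
small_view = perturb(image, "specialized")  # milder perturbation
large_view = perturb(image, "universal")    # stronger perturbation
```

In this sketch the same source image yields two differently perturbed views, one per model; a learned or data-dependent scale would replace the fixed dictionary in any adaptive variant.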