Perturbating, Tuning, and Collaborating: Harnessing Vision Foundation Models for Single Domain Generalization on Medical Imaging
Abstract: Single Domain Generalization (SDG) is critical in medical
imaging applications. Recently, Vision Foundation Models
(VFMs) have spearheaded a trend in AI development due
to their robust generalizability and versatility. This work
aims to fully explore the generalization capabilities of VFMs
alongside the domain-specific expertise of specialized models,
thoroughly investigating the boundaries of their respective
capabilities, thereby collaboratively addressing SDG challenges
within medical imaging. We propose a framework for
Collaborative reasoning between Specialized and Universal
models for Single Domain Generalization (CollaSU-SDG)
in medical imaging. Specifically, we first design a model-aware
perturbation injection method from the perspective of
single-source domain data, enabling differentiated and adaptive
perturbation injection for models of two different scales.
Then, a domain expansion adapter is designed for the VFM
to adapt to the augmented single-source domain medical data.
Lastly, we introduce an adaptive hierarchical transfer and dynamic
dense prompting method that facilitates collaborative
reasoning between the specialized and universal models,
eliminating the need for explicit prompts. Through these designs,
CollaSU-SDG fully leverages the strengths of both specialized
and universal models, achieving robust out-of-distribution
generalization capabilities on single-source domain data.
Experimental results demonstrate that CollaSU-SDG significantly
advances the state-of-the-art performance across a wide range
of medical datasets. All the code will be publicly available.