Keywords: Domain generalization, Vision Language Model, CLIP, Low-rank Adaptation
Abstract: Domain Generalization (DG) aims to learn a model from multiple source domains
to achieve satisfactory performance on unseen target domains. Recent works
introduce CLIP into DG tasks due to its superior image-text alignment and zero-shot
performance. Previous methods adopt either full fine-tuning or prompt-learning
paradigms to harness CLIP for DG tasks. These works focus on avoiding
catastrophic forgetting of the original knowledge encoded in CLIP, but ignore
that this knowledge may inherently contain domain-specific cues that
constrain its domain generalization performance. In this paper, we propose a new
perspective to harness CLIP for DG, i.e., attention head purification. We observe
that different attention heads may encode different properties of an image and
selecting heads appropriately may yield remarkable performance improvements
across domains. Based on these observations, we purify the attention heads of
CLIP at two levels: task-level purification and domain-level purification.
For task-level purification, we design head-aware LoRA to make each head better
adapted to the task at hand. For domain-level purification, we perform
head selection via a simple gating strategy. We employ a Maximum Mean Discrepancy
(MMD) loss to encourage the masked head features to be more domain-invariant,
thereby emphasizing more generalizable properties/heads. During training, we
perform task-level and domain-level purification jointly. We conduct extensive
experiments on various representative DG benchmarks. Despite its simplicity, our
method performs favorably against previous state-of-the-art methods.
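To make the two purification levels concrete, below is a minimal PyTorch sketch of head-aware LoRA, head gating, and an RBF-kernel MMD loss. All names (`HeadAwareLoRA`, `HeadGate`, `mmd_loss`) and hyper-parameters (rank `rank`, bandwidth `sigma`) are assumptions for exposition, not the paper's actual implementation.

```python
# Illustrative sketch only: module names, shapes, and hyper-parameters are
# assumptions for exposition, not the paper's released implementation.
import torch
import torch.nn as nn


class HeadAwareLoRA(nn.Module):
    """Task-level purification: one low-rank adapter per attention head.

    Wraps a frozen projection (d_in -> d_model) whose output splits into
    num_heads heads of size d_head; head h receives its own update B_h A_h x.
    """

    def __init__(self, frozen_linear: nn.Linear, num_heads: int, rank: int = 4):
        super().__init__()
        self.base = frozen_linear
        for p in self.base.parameters():
            p.requires_grad = False  # keep the pre-trained CLIP weights frozen
        d_model, d_in = frozen_linear.out_features, frozen_linear.in_features
        self.num_heads, self.d_head = num_heads, d_model // num_heads
        # One (A, B) pair per head; B starts at zero so the adapted model
        # initially reproduces the original CLIP behavior.
        self.A = nn.Parameter(0.01 * torch.randn(num_heads, rank, d_in))
        self.B = nn.Parameter(torch.zeros(num_heads, self.d_head, rank))

    def forward(self, x):  # x: (batch, seq, d_in)
        out = self.base(x)  # frozen path
        delta = torch.einsum('hrk,bsk->bshr', self.A, x)       # (b, s, H, r)
        delta = torch.einsum('hdr,bshr->bshd', self.B, delta)  # (b, s, H, d_head)
        return out + delta.flatten(2)  # concatenate heads back to d_model


class HeadGate(nn.Module):
    """Domain-level purification: a learnable soft gate over heads."""

    def __init__(self, num_heads: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_heads))

    def forward(self, head_feats):  # head_feats: (batch, num_heads, d_head)
        gate = torch.sigmoid(self.logits)  # soft mask in (0, 1) per head
        return head_feats * gate.view(1, -1, 1)


def mmd_loss(x, y, sigma: float = 1.0):
    """Biased RBF-kernel MMD^2 between feature batches of two source domains."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b).pow(2) / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()
```

In such a setup, a task loss on the adapted features and the MMD loss over gated head features from pairs of source domains would be minimized jointly; the loss weighting and the exact layers receiving adapters are design choices left to the full paper.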
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 3093