Balancing Domain-Invariant and Domain-Specific Knowledge for Domain Generalization with Online Knowledge Distillation
Keywords: Transfer Learning, Domain Generalization, Knowledge Distillation
TL;DR: A novel framework that improves a model's generalizability on unseen domains by distilling both domain-invariant and domain-specific knowledge from a teacher model through online knowledge distillation.
Abstract: Deep learning models often experience performance degradation when the distribution of testing data differs from that of training data.
Domain generalization addresses this problem by leveraging knowledge from multiple source domains to enhance model generalizability.
Recent studies have shown that distilling knowledge from large pretrained models effectively improves a model's ability to generalize to unseen domains. However, current knowledge distillation-based domain generalization approaches overlook the importance of domain-specific knowledge and rely on a two-stage training process, which limits the effectiveness of knowledge transfer. To overcome these limitations, we propose the Balanced Online knowLedge Distillation (BOLD) framework for domain generalization. BOLD employs a multi-domain expert teacher model, with each expert specializing in specific source domains to preserve domain-specific knowledge. This approach enables the student to distill both domain-invariant and domain-specific knowledge from the teacher. Additionally, BOLD adopts an online knowledge distillation strategy in which the teacher and student learn simultaneously, allowing the teacher to adapt based on the student's feedback, thereby enhancing knowledge transfer and improving the student's generalizability. Extensive experiments against state-of-the-art baselines on seven domain generalization benchmarks demonstrate the effectiveness of the BOLD framework. We also provide a theoretical analysis that underscores the roles of domain-specific knowledge and the online knowledge distillation strategy in domain generalization.
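The sketch below illustrates, in PyTorch, the kind of online distillation step the abstract describes: a teacher with one expert head per source domain, a student that distills from both the expert ensemble (domain-invariant view) and the matching expert (domain-specific view), and a feedback term that lets the teacher adapt to the student. This is a minimal sketch under stated assumptions, not the paper's implementation; the module names, loss weights (alpha, beta, gamma), temperature, and architectures are all illustrative.

```python
# Hypothetical sketch of online KD with a multi-domain expert teacher.
# Module names, loss weighting, and architectures are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_DOMAINS, NUM_CLASSES, FEAT_DIM, T = 3, 7, 128, 4.0  # T: distillation temperature


class MultiDomainTeacher(nn.Module):
    """Shared backbone with one expert classifier head per source domain."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(512, FEAT_DIM), nn.ReLU())
        self.experts = nn.ModuleList(
            [nn.Linear(FEAT_DIM, NUM_CLASSES) for _ in range(NUM_DOMAINS)]
        )

    def forward(self, x):
        z = self.backbone(x)
        return [head(z) for head in self.experts]  # one logit tensor per expert


class Student(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(512, FEAT_DIM), nn.ReLU(),
                                 nn.Linear(FEAT_DIM, NUM_CLASSES))

    def forward(self, x):
        return self.net(x)


def kd_loss(student_logits, teacher_logits):
    """Temperature-scaled KL divergence used for distillation."""
    return F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T


teacher, student = MultiDomainTeacher(), Student()
opt = torch.optim.SGD(list(teacher.parameters()) + list(student.parameters()), lr=1e-2)


def training_step(x, y, domain_id, alpha=0.5, beta=0.5, gamma=0.5):
    """One online-distillation step: teacher and student are updated together."""
    expert_logits = teacher(x)                              # per-domain expert outputs
    invariant_logits = torch.stack(expert_logits).mean(0)   # expert ensemble as a domain-invariant view
    specific_logits = expert_logits[domain_id]              # expert matching the batch's source domain
    s_logits = student(x)

    loss = (
        F.cross_entropy(specific_logits, y)                       # teacher expert keeps learning online
        + F.cross_entropy(s_logits, y)                            # student supervised loss
        + alpha * kd_loss(s_logits, invariant_logits.detach())    # student <- domain-invariant knowledge
        + beta * kd_loss(s_logits, specific_logits.detach())      # student <- domain-specific knowledge
        + gamma * kd_loss(specific_logits, s_logits.detach())     # teacher adapts to student feedback (mutual-learning style)
    )
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


# Toy usage: one batch drawn from source domain 1.
x = torch.randn(8, 512)
y = torch.randint(0, NUM_CLASSES, (8,))
print(training_step(x, y, domain_id=1))
```

In this sketch the "balance" between domain-invariant and domain-specific knowledge is controlled by the alpha and beta weights, and the gamma term is one simple way to realize the teacher-adapts-to-student feedback described in the abstract; how the actual framework weighs and implements these terms is specified in the paper itself.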
Supplementary Material: zip
Primary Area: transfer learning, meta learning, and lifelong learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9843