# Balancing Domain-Invariant and Domain-Specific Knowledge for Domain Generalization with Online Knowledge Distillation

Deep learning models often experience performance degradation when the distribution of testing data differs from that of training data.
Domain generalization addresses this problem by leveraging knowledge from multiple source domains to enhance model generalizability.
Recent studies have shown that distilling knowledge from large pretrained models effectively improves a model's ability to generalize to unseen domains.
However, current knowledge distillation-based domain generalization approaches overlook the importance of domain-specific knowledge and rely on a two-stage training process, which limits the effectiveness of knowledge transfer.
To overcome these limitations, we propose the Balanced Online knowLedge Distillation (BOLD) framework for domain generalization.
BOLD employs a multi-domain expert teacher model, with each expert specializing in specific source domains to preserve domain-specific knowledge.
This approach enables the student to distil both domain-invariant and domain-specific knowledge from the teacher.
Additionally, BOLD adopts an online knowledge distillation strategy where the teacher and students learn simultaneously, allowing the teacher to adapt based on the student's feedback, thereby enhancing knowledge transfer and improving the student's generalizability.
Extensive experiments conducted with state-of-the-art baselines on seven domain generalization benchmarks demonstrate the effectiveness of the BOLD framework. 
We also provide a theoretical analysis that underscores the effectiveness of domain-specific knowledge and the online knowledge distillation strategy in domain generalization.

# Baseline
* ERM
* 2018 - CrossGrad - [Generalizing Across Domains via Cross-Gradient Training](https://openreview.net/forum?id=r1Dx7fbCW)
* 2020 - DDAIG - [Deep Domain-Adversarial Image Generation for Domain Generalisation](https://arxiv.org/abs/2003.06054)
* 2021 - MixStyle - [Domain Generalization with MixStyle](https://openreview.net/forum?id=6xHJ37MVxxp)
* 2021 - NKD - [Embracing the Dark Knowledge: Domain Generalization Using Regularized Knowledge Distillation](https://dl.acm.org/doi/abs/10.1145/3474085.3475434)
* 2022 - DomainMix - [Dynamic Domain Generalization](https://arxiv.org/abs/2205.13913)
* 2022 - EFDMix - [Exact Feature Distribution Matching for Arbitrary Style Transfer and Domain Generalization](https://arxiv.org/abs/2203.07740)
* 2023 - RISE - [A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance](https://openaccess.thecvf.com/content/ICCV2023/html/Huang_A_Sentence_Speaks_a_Thousand_Images_Domain_Generalization_through_Distilling_ICCV_2023_paper.html)
* 2024 - SSPL -  [Symmetric Self-Paced Learning for Domain Generalization](https://ojs.aaai.org/index.php/AAAI/article/view/29639)

# Datasets
* Digits
* PACS
* OfficeHome
* VLCS
* Terra Incognita
* NICO++
* DomainNet

# Sample Command

python train.py

                --gpu 1                                                 # Specify device
                --seed 995                                              # Random Seed
                --output-dir output/BOKD-RN50-NICO-autumn               # Output directory 
                --dataset NICO                                          # Specify dataset
                --source-domains dim grass outdoor rock water           # Source Domains
                --target-domains autumn                                 # Target Domain
                --model BOKD                                            # Model for training
                --model-config-file config/bokd.yaml                    # Config file for model
