# Supplementary Material: Code for "DomED: Redesigning Ensemble Distillation for Domain Generalization"

This document provides instructions for setting up the environment and running the code associated with our paper, "DomED: Redesigning Ensemble Distillation for Domain Generalization".

The codebase is built upon the [DomainBed](https://github.com/facebookresearch/DomainBed) framework.

## 1. Code Structure

The contents of this supplementary material are organized as follows:

-   `README.md`: This instruction file.
-   `environment.yml`: Conda environment specification file for creating a reproducible environment.
-   `train_all_t.py`: Main script for training the teacher models (Stage 1).
-   `train_all.py`: Main script for training the DomED student model (Stage 2).
-   `domainbed/`: Directory containing the core framework and algorithm implementations.
-   `config.yaml`: Configuration file for hyperparameters.
-   `cuda0.sh`: Shell script containing the training command.

## 2. Environment Setup

We recommend using Conda to set up the required environment, as it ensures all dependencies, including specific CUDA versions, are correctly installed.

1.  **Create the Conda Environment:**
    From the root directory of the unzipped folder, run the following command. It creates a new Conda environment named `domed-env` from the provided `environment.yml` file.
    ```bash
    conda env create -f environment.yml
    ```

2.  **Activate the Environment:**
    Once the environment is successfully created, activate it before running any scripts.
    ```bash
    conda activate domed-env
    ```
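Before moving on, it can help to confirm that the environment was actually created. The snippet below is a minimal sketch, not part of the released code: `has_conda_env` is a hypothetical helper that scans the output of `conda env list` for a given environment name.

```bash
# Hypothetical helper (not part of this codebase): succeeds if the named
# environment appears in the first column of `conda env list` output
# supplied on stdin.
has_conda_env() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example: conda env list | has_conda_env domed-env && echo "domed-env found"
```

The helper reads from stdin so that it has no side effects of its own; pipe `conda env list` into it as shown in the final comment.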

## 3. Dataset Preparation

The datasets are not included with this code. Please download the standard domain generalization datasets (e.g., PACS and OfficeHome) from their official sources.

The datasets should be organized in a root directory, which we will refer to as `<PATH_TO_DATASETS>`. The directory structure must be compatible with the DomainBed framework. For example, for the PACS dataset:
```
<PATH_TO_DATASETS>/
└── PACS/
    ├── art_painting/
    ├── cartoon/
    ├── photo/
    └── sketch/
```
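A quick way to catch layout mistakes before training is to check for the four domain folders shown above. The following is a minimal sketch under that assumption; `check_pacs_layout` is a hypothetical helper, not a script shipped with this material, and it only checks that the folders exist, not their contents.

```bash
# Hypothetical helper (not part of this codebase): checks that the four PACS
# domain folders exist under <root>/PACS/. Returns 0 if all are present,
# 1 otherwise, and reports each missing folder on stderr.
check_pacs_layout() {
    local root="${1%/}"   # strip one trailing slash, if any
    local missing=0
    local domain
    for domain in art_painting cartoon photo sketch; do
        if [ ! -d "$root/PACS/$domain" ]; then
            echo "missing: $root/PACS/$domain" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example: check_pacs_layout /path/to/your/datasets/
```

Pass the same dataset root that you later supply to `--data_dir1`.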

## 4. How to Run the Experiments

The training process consists of two main stages. Below are example commands for the **PACS** dataset. Please replace `/path/to/your/datasets/` with the actual path to your dataset root (the `<PATH_TO_DATASETS>` directory described in Section 3).

### Stage 1: Train Domain-Specific Teacher Models

This stage trains the ERM teacher models, one for each source domain.

**Command:**
```bash
python train_all_t.py ERM_teachers_pacs \
    --dataset PACS \
    --data_dir1 /path/to/your/datasets/ \
    --data_dir2 ./teachers/ \
    --device cuda:0 \
    --seed 0 \
    --lr 5e-5 \
    --algorithm ERM
```
-   **Output:** The trained teacher models will be saved in a new directory named `teachers/PACS/` inside the current project folder.

---

### Stage 2: Train the Student Model with DomED

This stage trains the final student model by distilling knowledge from the pre-trained teachers.

**Command:**
```bash
python train_all.py DomED_student_pacs \
    --dataset DomED_PACS \
    --data_dir1 /path/to/your/datasets/ \
    --data_dir2 ./teachers/PACS/2000 \
    --device cuda:1 \
    --seed 0 \
    --lr 5e-5 \
    --algorithm DomED
```
-   **Input:** This command loads the teacher models from the `./teachers/PACS/2000` directory.
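-   **Note:** The `2000` in the `--data_dir2` path refers to the training step of the teacher models to load. Adjust this value to match a step at which your teacher models were actually saved.
-   **Note:** This example uses `--device cuda:1`, whereas Stage 1 uses `cuda:0`. On a single-GPU machine, change it to `cuda:0`.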
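To see which step values are available before choosing one, you can list the contents of the teacher output directory. The sketch below assumes that each saved step is stored in its own numerically named subdirectory (as the `./teachers/PACS/2000` path suggests); `list_teacher_steps` is a hypothetical helper, not part of the codebase.

```bash
# Hypothetical helper (not part of this codebase): prints the numerically
# named subdirectories of a teacher output directory, sorted ascending.
# Assumes one subdirectory per saved training step.
list_teacher_steps() {
    local dir="${1%/}"
    local path name
    for path in "$dir"/*/; do
        [ -d "$path" ] || continue       # also skips an unmatched glob
        name=$(basename "$path")
        case "$name" in
            ''|*[!0-9]*) ;;              # skip non-numeric names
            *) echo "$name" ;;
        esac
    done | sort -n
}

# Example: list_teacher_steps ./teachers/PACS
```

If the listing is empty, either Stage 1 has not finished or the checkpoints are stored under a different layout than assumed here.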

## 5. Final Remarks

By following these steps, the key results presented in our paper should be reproducible. Thank you for your time and for reviewing our work.