# Memory Efficient Federated Domain Adaptation
The official implementation of our paper: "Source-Target Unified Knowledge Distillation for Memory-Efficient Federated Domain Adaptation on Edge Devices".


## Files
```
├── data_preprocessing
│   ├── data_loading.py
│   └── __init__.py
├── experiments
│   ├── collaborate_DA_completed.py
│   ├── finetune_compact.py
│   ├── train_large_src.py
│   └── train_large_tar.py
├── fedml_core
├── fedxdd
├── model
│   ├── __init__.py
│   ├── modules.py
│   └── network.py
├── pretrain_compact.sh
├── pretrain_teacher.sh
├── README.md
├── run_collaborate.sh
├── requirements.txt
└── utils
    ├── common_utils.py
    ├── __init__.py
    ├── loss.py
    └── memory_cost_profiler.py
```
## Requirements
The code requires PyTorch 1.7.0 and the Python packages listed in `requirements.txt`.

To install PyTorch 1.7.0:
```{shell}
conda install pytorch==1.7.0 torchvision==0.8.0 torchaudio==0.7.0 cudatoolkit=11.0 -c pytorch
```

To install the supporting packages:
```{shell}
pip install -r requirements.txt
```

## Dataset Preparation
Download the following ```${dataset}``` to ```${dataset_path}```:
- ```office``` (Office-31)
    - Link: https://www.jianguoyun.com/p/Dblj5GcQmN7PCBiA9asD (Password: FcaDrw)
    - 3 domains: amazon, dslr, webcam; 
    - 31 classes.
- ```office-home``` (Office-Home)
    - Link: http://hemanthdv.org/OfficeHome-Dataset/ 
    - 4 domains: Art, Clipart, Product, RealWorld; 
    - 65 classes.
- ```office-caltech``` (Office-Caltech10)
    - Link: https://pan.baidu.com/s/14JEGQ56LJX7LMbd6GLtxCw
    - 4 domains: amazon, caltech, dslr, webcam; 
    - 10 classes.
- ```imageCLEF``` (ImageCLEF)
    - Link: https://pan.baidu.com/s/1-06SNBiG1sfwVfPwrscwGw (Password: xftq)
    - 3 domains: C, I, P; 
    - 12 classes.

Alternative dataset download links can be found here: https://github.com/jindongwang/transferlearning/tree/master/data
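
Before training, it can help to confirm that every domain of the chosen dataset is present. The sketch below is not part of this repository: the `missing_domains` helper is hypothetical, and it assumes each domain sits in its own sub-directory directly under ```${dataset_path}```. The loader in `data_preprocessing/data_loading.py` may expect a different layout, so adjust accordingly.

```python
from pathlib import Path

# Domain names as listed above, keyed by the ${dataset} identifier.
EXPECTED_DOMAINS = {
    "office": ["amazon", "dslr", "webcam"],
    "office-home": ["Art", "Clipart", "Product", "RealWorld"],
    "office-caltech": ["amazon", "caltech", "dslr", "webcam"],
    "imageCLEF": ["C", "I", "P"],
}


def missing_domains(dataset, dataset_path):
    """Return the expected domain folders that are absent under dataset_path."""
    if dataset not in EXPECTED_DOMAINS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    root = Path(dataset_path)
    # Assumption: one sub-directory per domain, directly under dataset_path.
    return [d for d in EXPECTED_DOMAINS[dataset] if not (root / d).is_dir()]
```

Under that layout assumption, an empty list from `missing_domains("office", dataset_path)` suggests the download is complete; any names returned are the domain folders still to be added.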

## Usage
Pretrain the compact student model on source domains:
```{shell}
bash pretrain_compact.sh ${dataset} ${dataset_path} (${gpu_id})
```

Pretrain the large teacher model on source domains first, and then on target domains:
```{shell}
bash pretrain_teacher.sh ${dataset} ${dataset_path} (${gpu_id})
```

After pretraining of the teacher and student models is complete, run collaborative domain adaptation:
```{shell}
bash run_collaborate.sh ${dataset} ${dataset_path} (${gpu_id})
```
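
The three stages above can also be chained from a small driver. The following is a minimal sketch, not a script shipped with this repository: `build_commands` and `run_pipeline` are hypothetical helpers, and the sketch assumes that `gpu_id` is an optional third positional argument, as the parentheses in the commands above suggest. It is meant to be run from the repository root.

```python
import subprocess

# Script names come from the file listing; the order follows the steps above.
PIPELINE = ["pretrain_compact.sh", "pretrain_teacher.sh", "run_collaborate.sh"]


def build_commands(dataset, dataset_path, gpu_id=None):
    """Return one argv list per stage, in the order the stages should run."""
    commands = []
    for script in PIPELINE:
        argv = ["bash", script, dataset, str(dataset_path)]
        if gpu_id is not None:  # assumption: gpu_id is an optional third argument
            argv.append(str(gpu_id))
        commands.append(argv)
    return commands


def run_pipeline(dataset, dataset_path, gpu_id=None, dry_run=True):
    """Print each stage's command; execute it as well when dry_run is False."""
    for argv in build_commands(dataset, dataset_path, gpu_id):
        print(" ".join(argv))
        if not dry_run:
            # check=True stops the pipeline at the first failing stage.
            subprocess.run(argv, check=True)
```

Calling `run_pipeline("office", "/path/to/office", gpu_id=0)` only prints the three commands; pass `dry_run=False` to execute them in sequence.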

## References
```
@Misc{transferlearning.xyz,
howpublished = {\url{http://transferlearning.xyz}},   
title = {Everything about Transfer Learning and Domain Adaptation},
author = {Wang, Jindong and others}  
}  

@inproceedings{liang2020shot,
    title={Do We Really Need to Access the Source Data? Source Hypothesis Transfer for Unsupervised Domain Adaptation},
    author={Liang, Jian and Hu, Dapeng and Feng, Jiashi},
    booktitle={International Conference on Machine Learning (ICML)},
    pages={6028--6039},
    month = {July 13--18},
    year={2020}
}

@inproceedings{cai2020tinytl,
title={TinyTL: Reduce Memory, Not Parameters for Efficient On-Device Learning},
author={Cai, Han and Gan, Chuang and Zhu, Ligeng and Han, Song},
booktitle={Advances in Neural Information Processing Systems},
volume={33},
year={2020}
} 

@article{chaoyanghe2020fedml,
  Author = {He, Chaoyang and Li, Songze and So, Jinhyun and Zhang, Mi and Wang, Hongyi and Wang, Xiaoyang and Vepakomma, Praneeth and Singh, Abhishek and Qiu, Hang and Shen, Li and Zhao, Peilin and Kang, Yan and Liu, Yang and Raskar, Ramesh and Yang, Qiang and Annavaram, Murali and Avestimehr, Salman},
  Journal = {arXiv preprint arXiv:2007.13518},
  Title = {FedML: A Research Library and Benchmark for Federated Machine Learning},
  Year = {2020}
}
```
