# DeNAV: Decentralized Self-Supervised Learning with a Training Navigator

The PyTorch implementation of the paper "DeNAV: Decentralized Self-Supervised Learning with a Training Navigator", submitted to the ICLR conference.

## Environment Installation

1. Install Anaconda from the [Anaconda official website](https://www.anaconda.com/).
2. Run the following commands to create the virtual environment:

```
cd distributed_SSL
conda env create -f environment.yml
```
3. Run the following command to activate the installed virtual environment:
```
conda activate dist
```
4. Find the appropriate command on the [PyTorch official website](https://pytorch.org/) and use it to install PyTorch in the virtual environment.

5. If a library is missing, install it with `pip`, replacing `xxx` with the package name:
```
pip install xxx
```

## Instructions

### -- Dataset Preparation

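Our code supports the following datasets. The list index matches the `-d` value used in the examples below (for example, `-d 4` selects Mini-ImageNet):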

0. CIFAR-10 (Supported by PyTorch)
1. CIFAR-100 (Supported by PyTorch)
2. Food-101 (Supported by PyTorch)
3. ImageNet (Need to download from [ImageNet official website](https://image-net.org/index.php). How to extract: [link](https://gist.github.com/BIGBALLON/8a71d225eff18d88e469e6ea9b39cef4))
4. Mini-ImageNet (After downloading ImageNet, see repo [Tools for mini-ImageNet Dataset](https://github.com/yaoyao-liu/mini-imagenet-tools#about-mini-ImageNet))
5. RoadSign (Need to download from [Kaggle](https://www.kaggle.com/datasets/sergeykulakin/russian-road-signs-categories-dataset))
6. Mini-INat2021 (Need to download from repo [iNaturalist Competition Datasets](https://github.com/visipedia/inat_comp/tree/master/2021))

For datasets that must be downloaded separately, remember to set the correct path to the data folder in **prepare_dataset.py** after downloading.

### -- Pretraining

Example 1 - DeNAV:
* Pretrain with Mini-ImageNet dataset
* Client data is IID
* Test network (100 clients, network connectivity 0.15, computing resources on a 1-5 scale)
* Default training settings
```
python main.py -p pretrain -d 4 -samp iid -nw 100 -c 0.15 -rd 200 -tt 3 -le 5 -nm 1 -sp ./checkpoint/ -gp ./graph/network_G.adjlist -dsp ./graph/data_split_iid.pkl -cpp ./graph/computing_power.pkl -swp ./graph/starting_workers.pkl -ri 100 -m 2 -agg 3 -bl 0 -bd 1 --seed 0
```
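
The test network in Example 1 is loaded from the files under `./graph/`; how those files are produced is not described here. As a rough, standard-library-only sketch of what such a network could look like, the snippet below builds a random undirected graph and assigns each client a computing-power level. It assumes that "connectivity" means the probability that any two clients are linked and that computing power is a uniformly drawn integer from 1 to 5. The helper names `make_test_network` and `is_connected` are hypothetical and are not part of this repository, and the snippet does not write the repository's file formats.

```python
import random


def make_test_network(num_clients=100, connectivity=0.15, seed=0):
    """Hypothetical helper: random undirected graph plus per-client computing power."""
    rng = random.Random(seed)
    neighbors = {i: set() for i in range(num_clients)}
    # Assumption: "connectivity" is the probability that any pair of clients is linked.
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            if rng.random() < connectivity:
                neighbors[i].add(j)
                neighbors[j].add(i)
    # Assumption: computing power is an integer drawn uniformly from 1 to 5.
    power = {i: rng.randint(1, 5) for i in range(num_clients)}
    return neighbors, power


def is_connected(neighbors):
    """Depth-first search: True if every client can reach every other client."""
    if not neighbors:
        return True
    start = next(iter(neighbors))
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for nxt in neighbors[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return len(seen) == len(neighbors)


neighbors, power = make_test_network()
avg_degree = sum(len(n) for n in neighbors.values()) / len(neighbors)
print("average degree:", avg_degree)
print("connected:", is_connected(neighbors))
```

A decentralized run generally needs a connected graph, so it may be worth redrawing with a different seed if the check fails.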

Example 2 - DeNAV with parallel training:
* Pretrain with ImageNet dataset
* Client data is Non-IID, with Dirichlet parameter 0.1
* Same network settings as example 1
* Default parallel training settings (5 training clients per pre-training step)
```
python main.py -p pretrain -d 3 -samp dir --alpha 0.1 -nw 100 -c 0.15 -rd 200 -tt 3 -le 5 -nm 5 -sp ./checkpoint2/ -gp ./graph/network_G.adjlist -dsp ./graph/data_split_noniid.pkl -cpp ./graph/computing_power.pkl -swp ./graph/starting_workers.pkl -ri 100 -m 2 -agg 3 -bl 0 -sb 5 -bd 1 --seed 1
```
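
The Non-IID setting in Example 2 is controlled by a Dirichlet parameter (`--alpha 0.1`). The snippet below is a minimal sketch of the commonly used label-based Dirichlet partition, in which each class is divided among clients according to proportions drawn from a Dirichlet distribution; smaller values of alpha give more skewed splits. It is an illustration under that assumption, not necessarily what `prepare_dataset.py` does, and the helper name `dirichlet_split` is hypothetical. It uses only the standard library, drawing the Dirichlet sample as normalized Gamma variates.

```python
import random
from collections import defaultdict


def dirichlet_split(labels, num_clients, alpha, seed=0):
    """Hypothetical helper: assign sample indices to clients, class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    client_indices = [[] for _ in range(num_clients)]
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # A Dirichlet(alpha) draw is a vector of Gamma(alpha, 1) draws, normalized.
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        total = sum(gammas)
        if total == 0.0:
            # Extremely unlikely numerical corner case: fall back to equal shares.
            gammas, total = [1.0] * num_clients, float(num_clients)
        # Turn cumulative proportions into cut points over this class's samples.
        cuts, running = [], 0.0
        for g in gammas[:-1]:
            running += g / total
            cuts.append(min(int(running * len(indices)), len(indices)))
        start = 0
        for client, end in enumerate(cuts + [len(indices)]):
            client_indices[client].extend(indices[start:end])
            start = end
    return client_indices


# Toy labels: 1000 samples over 10 classes, split across 100 clients.
labels = [i % 10 for i in range(1000)]
parts = dirichlet_split(labels, num_clients=100, alpha=0.1, seed=1)
print("largest client:", max(len(p) for p in parts))
print("empty clients:", sum(1 for p in parts if not p))
```

With a small alpha, many clients may receive few or no samples of a given class, which is the intended skew.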

Example 3 - Gossip Baseline:
* Pretrain with Mini-ImageNet dataset
* Client data is Non-IID, with Dirichlet parameter 0.1
* Same network settings as example 1
* Training for 50 rounds, with 5 local epochs
```
python main.py -p pretrain -d 4 -samp dir --alpha 0.1 -nw 100 -c 0.15 -rd 50 -le 5  -sp ./checkpoint3/ -gp ./graph/network_G.adjlist -dsp ./graph/data_split_noniid.pkl -cpp ./graph/computing_power.pkl -ri 10 -agg 3 -bl 2 -bd 1 --seed 2
```


### -- Finetuning (Requires pretrained checkpoint)

Example 1:
* Pretraining of Example 1 has completed
* Finetuning with CIFAR-10 dataset
* Use 100% labeled data
* Finetuning the encoder with 5 replicated blocks
```
python main.py -p finetune -d 0 -ra 1 -sp ./checkpoint/ -ftd 5 -bl 0 --seed 3
```

Example 2:
* Pretraining of Example 2 has completed
* Finetuning with CIFAR-100 dataset
* Use 10% labeled data
* Finetuning the encoder with 7 replicated blocks
```
python main.py -p finetune -d 1 -ra 0.1 -sp ./checkpoint2/ -ftd 7 -bl 0 --seed 4
```
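
The finetuning examples use either all of the labeled data (`-ra 1`) or 10% of it (`-ra 0.1`). As a sketch of what selecting a labeled fraction can look like, the snippet below keeps the same ratio of samples from every class. Whether this repository samples per class or uniformly over the whole dataset is not stated here, so the per-class choice is an assumption, and the helper name `sample_labeled_subset` is hypothetical.

```python
import random
from collections import defaultdict


def sample_labeled_subset(labels, ratio, seed=0):
    """Hypothetical helper: keep roughly `ratio` of the indices of each class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        # Keep at least one sample per class so no class disappears.
        keep = min(len(indices), max(1, round(ratio * len(indices))))
        chosen.extend(rng.sample(indices, keep))
    return sorted(chosen)


# Toy labels: 5000 samples over 100 classes, 50 samples per class.
labels = [i % 100 for i in range(5000)]
subset = sample_labeled_subset(labels, ratio=0.1, seed=4)
print("labeled samples kept:", len(subset))
```

Fixing the seed keeps the labeled subset identical across runs, which makes finetuning results easier to compare.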


## File Structure

```
├── util/ <code under this directory is taken from the MAE repo>
├── baseline_models.py <code for FSSL baselines>
├── distributed.py <code for network initialization, client selection, model aggregation and model cascading>
├── engine_finetune.py <the training engine for finetuning>
├── engine_pretrain.py <the training engine for pretraining>
├── environment.yml <information about the conda environment>
├── evaluation.py <main program of finetuning>
├── main.py <entry point for running pretraining and finetuning>
├── model_mae.py <implementation of MAE model>
├── model_ViT.py <implementation of ViT model>
├── prepare_dataset.py <code for loading datasets, preparing dataloaders and data division>
├── readme.md <ReadMe file>
├── train_worker.py <main program of client pretraining>
```

## Reference

This code is based on the repository [Masked Autoencoders: A PyTorch Implementation](https://github.com/facebookresearch/mae), which is the PyTorch implementation of the paper [Masked Autoencoders Are Scalable Vision Learners](https://arxiv.org/abs/2111.06377):
```
@Article{MaskedAutoencoders2021,
  author  = {Kaiming He and Xinlei Chen and Saining Xie and Yanghao Li and Piotr Doll{\'a}r and Ross Girshick},
  journal = {arXiv:2111.06377},
  title   = {Masked Autoencoders Are Scalable Vision Learners},
  year    = {2021},
}
```

The MAE repo in turn references the following projects:
 * [DeiT repo](https://github.com/facebookresearch/deit)
 * [timm](https://github.com/rwightman/pytorch-image-models)
 * [ELECTRA](https://github.com/google-research/electra)
 * [BEiT](https://github.com/microsoft/unilm/tree/master/beit)
 * [MoCo v3](https://github.com/facebookresearch/moco-v3)
 * [Transformer](https://github.com/tensorflow/models/blob/master/official/nlp/transformer/model_utils.py)


