## Required packages
Python >= 3.7, PyTorch >= 1.5 (1.4 might also work).
See `requirements.txt` for the full list of dependencies.
Note that Hydra must be version 1.0 (i.e. 1.0.0.rc1 as of 2020-06-24).

We use
[PyTorch Lightning](https://github.com/PyTorchLightning/pytorch-lightning) to
organize training code and [Hydra](https://hydra.cc/) to organize configurations.

[Optional] We use [Ray](https://github.com/ray-project/ray) for distributed
training.

[Optional] We use [Wandb](https://wandb.ai/) for logging.

## Code structure
```
├─ cfg               # Configuration files, for Hydra
├─ datasets          # Code for datasets
├─ models            # Model implementations
├─ scripts           # Scripts to tune hyperparameters
├─ distill_train.py  # Train student model, distilled from teacher model
├─ kd.py             # Implementation of knowledge distillation losses
├─ ray_utils.py      # Distributed training with Ray [optional]
├─ train.py          # Train a model (e.g. teacher or student) from scratch
├─ utils.py          # Utility functions
├─ wandblogger.py    # Logger for Wandb that works with Ray [optional]
```

## Training
To train a ResNet18 on CIFAR10 and save the model to disk:
```
python train.py train.batch_size=512 train.optimizer.lr=4e-1 model=resnet18 +save_checkpoint_path=checkpoints/resnet18/final.ckpt ray.local=True
```

To train a CNN5 student, distilled from a teacher model:
```
python distill_train.py train.batch_size=512 train.optimizer.lr=4e-1 train.kd.class=KDOrthoLoss train.gradient_clip_val=0.1 train.kd.temperature=2.0 ray.local=True
```
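
For intuition about what `train.kd.temperature` controls, here is a minimal sketch of the standard temperature-scaled distillation loss (KL divergence between softened teacher and student distributions, scaled by T²). This is an illustration only: it is not the `KDOrthoLoss` implemented in `kd.py`, the function names `softmax` and `kd_loss` are hypothetical, and it uses plain Python rather than PyTorch so that it runs without any dependencies.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2.

    Assumption: this is the generic soft-target distillation loss,
    not necessarily what kd.py computes.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(
        p * (math.log(p) - math.log(q))
        for p, q in zip(p_teacher, p_student)
        if p > 0
    )
    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return temperature ** 2 * kl


if __name__ == "__main__":
    teacher = [1.0, 2.0, 3.0]
    student = [3.0, 2.0, 1.0]
    print(kd_loss(student, teacher, temperature=2.0))
```

A higher temperature flattens both distributions, so the student is trained to match the teacher's relative preferences among non-target classes rather than only its top prediction.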

