# In Defense of Softmax Parametrization for Calibrated and Consistent Learning to Defer

## Requirements
- Python 3.6
- numpy 1.14
- PyTorch 1.10.1
- torchvision 0.11.2

## Demo
The demo code implements the proposed surrogate and probability estimator on the CIFAR-100 dataset with a 28-layer WideResNet. The defaults are 8 GPUs, 60 classes with experts, a batch size of 128 per GPU, 200 epochs, a learning rate of 1e-1, and a weight decay of 5e-4. Run the demo with the following command:
```bash
torchrun --nproc_per_node=8 demo.py
```
The statistics reported in the experiments are printed at the end of each epoch. Approximately 8 GB of memory is required on each GPU.
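
For readers adapting the command to a different number of GPUs, the following is a minimal sketch of how the defaults listed above relate to a `torchrun` launch. It is not taken from `demo.py`: the `DemoDefaults` class, its field names, and the helper functions are hypothetical, and the sketch only assumes that the script is started by `torchrun`, which exports `RANK`, `LOCAL_RANK`, and `WORLD_SIZE` to every worker process.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DemoDefaults:
    """Default values quoted in this README; field names are hypothetical."""
    num_gpus: int = 8
    expert_classes: int = 60        # number of classes with experts
    batch_size_per_gpu: int = 128
    epochs: int = 200
    learning_rate: float = 1e-1
    weight_decay: float = 5e-4


def distributed_context(env=None):
    """Read the per-process variables that torchrun exports.

    Falls back to a single-process setting when the variables are absent.
    """
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("RANK", 0)),
        "local_rank": int(env.get("LOCAL_RANK", 0)),
        "world_size": int(env.get("WORLD_SIZE", 1)),
    }


def global_batch_size(cfg, world_size):
    """Samples processed per optimizer step across all workers."""
    return cfg.batch_size_per_gpu * world_size
```

With the defaults, `global_batch_size(DemoDefaults(), 8)` gives 8 × 128 = 1024 samples per step. Launching with a smaller `--nproc_per_node` reduces this total proportionally unless the per-GPU batch size is changed, which may affect how closely a run matches the default setting.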
