## Usage
### Cifar10
To train on _Cifar10_, run:
```bash
export CUDA_VISIBLE_DEVICES=0
python image_classification/main.py [--optimizer] [--depth] [--lr] [--eps]
```
More optional arguments can be found by running
```bash
python image_classification/main.py -h
```
To reproduce our results, use the following arguments for each optimizer:
```
SGD:
--optimizer sgd --lr 0.1

Adam:
--optimizer adam --lr 0.001

AdamW:
--optimizer adamw --lr 0.005

AdaBelief:
--optimizer adabelief --lr 0.001

AdaHessian:
--optimizer adahessian --lr 0.15

AdaDQH:
--optimizer adadqh --lr 0.007 --eps 1e-2 --weight-decouple
```
The corresponding scripts are provided in the `config` directory.
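
To compare optimizers without retyping flags, the settings listed above can be wrapped in a small lookup function. This is only a sketch: `cifar10_args` is a hypothetical helper that is not part of this repository, and the loop only prints each command (a dry run) rather than starting training.

```bash
# cifar10_args is a hypothetical helper, not part of this repository.
# It prints the Cifar10 flags listed above for a given optimizer name.
cifar10_args() {
  case "$1" in
    sgd)        echo "--optimizer sgd --lr 0.1" ;;
    adam)       echo "--optimizer adam --lr 0.001" ;;
    adamw)      echo "--optimizer adamw --lr 0.005" ;;
    adabelief)  echo "--optimizer adabelief --lr 0.001" ;;
    adahessian) echo "--optimizer adahessian --lr 0.15" ;;
    adadqh)     echo "--optimizer adadqh --lr 0.007 --eps 1e-2 --weight-decouple" ;;
    *)          echo "unknown optimizer: $1" >&2; return 1 ;;
  esac
}

# Dry run: print the command for each optimizer instead of launching it.
for opt in sgd adam adamw adabelief adahessian adadqh; do
  echo "python image_classification/main.py $(cifar10_args "$opt")"
done
```

To launch a run instead of printing it, drop the `echo` and leave `$(cifar10_args "$opt")` unquoted so the shell splits it into separate arguments.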

### ImageNet
Before training on _ImageNet_, you need to generate a WebDataset version of the dataset yourself.
The script `makeshards.py` does this for you.
If your _ImageNet_ dataset is in `/data/imagenet`, run the following commands to convert it into shards:
```bash
python imagenet/makeshards.py --data /data/imagenet --splits train --shards ./imagenet/shards # for training set
python imagenet/makeshards.py --data /data/imagenet --splits val --shards ./imagenet/val_shards # for validation set
```
An example command for training on _ImageNet_:
```bash
python main-wds-v2.py --data ./imagenet --trainshards './imagenet/shards/imagenet-train-{000000..000146}.tar' --valshards './imagenet/val_shards/imagenet-val-{000000..000022}.tar' \
--dist-url tcp://127.0.0.1:9153 --dist-backend nccl --multiprocessing-distributed --world-size 1 --rank 0 \
--workers 8 --print-freq 500 --lr 0.0004 --eps 1e-5 --optimizer adadqh
```
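
The shard ranges in `--trainshards` and `--valshards` are single-quoted, so the shell passes the brace pattern to the script unexpanded. The range has to match the number of shards that `makeshards.py` actually produced on your machine. A minimal sketch for building the pattern from the files on disk follows. `shard_pattern` is a hypothetical helper that is not part of this repository, and it assumes shards are numbered contiguously from `000000` with six-digit zero padding, as in the example above.

```bash
# shard_pattern is a hypothetical helper, not part of this repository.
# Usage: shard_pattern <directory> <prefix>
# Assumes shards are named <prefix>-NNNNNN.tar, numbered contiguously from 000000.
shard_pattern() {
  dir=$1
  prefix=$2
  # Expand the glob into this function's positional parameters.
  set -- "$dir"/"$prefix"-*.tar
  # With no matches the glob stays literal, so check that the first entry exists.
  if [ ! -e "$1" ]; then
    echo "no shards found in $dir" >&2
    return 1
  fi
  # $# is the number of shard files; the last index is one less.
  printf '%s/%s-{000000..%06d}.tar\n' "$dir" "$prefix" "$(( $# - 1 ))"
}
```

For example, `shard_pattern ./imagenet/shards imagenet-train` prints a pattern that can be passed, quoted, as the value of `--trainshards`.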
The arguments we used for different optimizers are given here:
```
SGD:
--optimizer sgd --lr 0.1

Adam:
--optimizer adam --lr 0.001

AdamW:
--optimizer adamw --lr 0.005

AdaBelief:
--optimizer adabelief --lr 0.001

AdaHessian:
--optimizer adahessian --lr 0.0001

AdaDQH:
--optimizer adadqh --lr 0.0004 --eps 1e-5
```
The corresponding scripts are provided in the `config` directory.

## Credit
- The code for training on _Cifar10_ is modified from [AdaHessian](https://github.com/amirgholami/adahessian)
- The code for training on _ImageNet_ is modified from [PyTorch Imagenet Example](https://github.com/tmbdev-archive/webdataset-examples)
