# SLIM-QN
A stochastic light momentumized quasi-Newton optimizer for large-scale model training.
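
For orientation, quasi-Newton methods in the L-BFGS family (the SLIM-QN training scripts below are named `*-LBFGS.sh`) build their search direction from a short history of parameter and gradient differences. The following is a minimal, generic sketch of the standard L-BFGS two-loop recursion in plain Python. It is not this repository's implementation, and it omits the momentum and any other SLIM-QN-specific changes. The names `dot` and `two_loop_direction` are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length sequences of floats."""
    return sum(x * y for x, y in zip(a, b))


def two_loop_direction(grad, s_hist, y_hist):
    """Approximate (inverse Hessian) @ grad with the standard L-BFGS two-loop recursion.

    s_hist[i] is a parameter difference x_{i+1} - x_i and y_hist[i] is the
    matching gradient difference g_{i+1} - g_i, both stored oldest first.
    Generic textbook version, NOT the SLIM-QN update.
    """
    q = list(grad)
    rhos = [1.0 / dot(y, s) for s, y in zip(s_hist, y_hist)]

    # First loop: walk the history from newest to oldest.
    alphas = []
    for s, y, rho in reversed(list(zip(s_hist, y_hist, rhos))):
        alpha = rho * dot(s, q)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
        alphas.append(alpha)

    # Scale by an initial inverse-Hessian guess gamma * I.
    if s_hist:
        gamma = dot(s_hist[-1], y_hist[-1]) / dot(y_hist[-1], y_hist[-1])
    else:
        gamma = 1.0  # no history yet: fall back to the plain gradient
    r = [gamma * qi for qi in q]

    # Second loop: walk the history from oldest to newest.
    for (s, y, rho), alpha in zip(zip(s_hist, y_hist, rhos), reversed(alphas)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return r
```

A plain quasi-Newton step would then move the parameters along the negative of the returned direction, scaled by a learning rate.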

## Requirements
The code has been tested with
- Python 3.6.9
- CUDA 11.0
- PyTorch 1.8.1
- PyTorch Lightning 1.2.5

## Directory Structure
```
./
|
|-train.py: main entry point
|
|-*.sh: scripts for training models
|
|-model: predefined models
|
|-opt: optimizer implementation
|
|-plot: Python scripts for visualizing training and results
|
|-utils: utility functions
```

## Training Script
```bash
sh <script>.sh <num_gpus> <address> <port>
```
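
The three positional arguments look like the usual inputs for a multi-GPU launch: a process count plus a rendezvous address and port. As a rough sketch only, assuming a single node with one process per GPU and assuming the scripts forward these values to PyTorch's `env://` initialization (this README does not confirm either), they could map onto the `MASTER_ADDR`, `MASTER_PORT`, and `WORLD_SIZE` environment variables that `torch.distributed` reads. The helper `make_dist_env` is hypothetical and not part of this repository.

```python
import os


def make_dist_env(num_gpus, address, port):
    """Build the environment variables for torch.distributed's env:// init.

    Assumption: single node, one process per GPU, so WORLD_SIZE == num_gpus.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if not 0 < int(port) < 65536:
        raise ValueError("port must be in the range 1-65535")
    return {
        "MASTER_ADDR": str(address),
        "MASTER_PORT": str(port),
        "WORLD_SIZE": str(num_gpus),
    }


if __name__ == "__main__":
    # Mirrors the CIFAR-10 invocation: 1 GPU, localhost, port 11111.
    os.environ.update(make_dist_env(1, "127.0.0.1", 11111))
```

Each worker process would additionally need its own `RANK`, which a launcher normally sets per process.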
### CIFAR-10
- SGD
```bash
sh run-cifar10-SGD.sh 1 127.0.0.1 11111
```
- SLIM-QN
```bash
sh run-cifar10-LBFGS.sh 1 127.0.0.1 11111
```
### ImageNet
- SGD
```bash
sh run-imagenet-SGD.sh 8 127.0.0.1 11111
```
- KFAC
```bash
sh run-imagenet-KFAC.sh 8 127.0.0.1 11111
```
- SLIM-QN
```bash
sh run-imagenet-LBFGS.sh 8 127.0.0.1 11111
```

## Results
- CIFAR-10
<figure>
<img src="plot/cifar10_acc1.pdf" alt="Convergence on CIFAR-10 using SGD and SLIM-QN" style="width:80%">
<figcaption align = "center"><b>Fig.1 - Convergence on CIFAR-10 using SGD and SLIM-QN</b></figcaption>
</figure>

- ImageNet
<figure>
<img src="plot/imagenet_acc1_iter.pdf" alt="Convergence on ImageNet using SGD, KFAC and SLIM-QN" style="width:80%">
<figcaption align = "center"><b>Fig.2 - Convergence on ImageNet using SGD, KFAC and SLIM-QN</b></figcaption>
</figure>