# A Separable Self-attention Inspired by State Space Model for Computer Vision

## ImageNet classification

### 1. Requirements

torch>=1.7.0; torchvision>=0.8.0; pyyaml; timm==0.6.13; einops; fvcore; h5py
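
Before training, it can help to confirm which of these packages are present in the active environment. The snippet below is a minimal sketch, not part of this repository: the helper name `installed_versions` is hypothetical, and it only reports the installed versions without checking them against the version constraints listed above. It assumes Python 3.8 or newer for `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version

# Package names copied from the requirements line above.
REQUIRED = ["torch", "torchvision", "pyyaml", "timm", "einops", "fvcore", "h5py"]


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version string, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in the current environment.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any package reported as missing can then be installed with `pip`, keeping the pinned `timm==0.6.13` version.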

### 2. Train VMINet

```
python3 -m torch.distributed.launch --nproc_per_node=3 train_imagenet.py --data {path-to-imagenet} --model {VMINet-variants} -b 256 --lr 1e-3 --weight-decay 0.025 --aa rand-m1-mstd0.5-inc1 --cutmix 0.2 --color-jitter 0. --drop-path 0.
```

### 3. Pretrained checkpoint

Due to the 100 MB limit on supplementary materials, we provide only the VMINet-XS checkpoint for testing.
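
When loading the checkpoint for evaluation, the saved file may wrap the weights in an outer dictionary, and keys may carry a `module.` prefix if the model was saved while wrapped for distributed training. The helper below is a hedged sketch under those assumptions: the name `extract_state_dict` and the candidate keys `"state_dict"` and `"model"` are illustrative guesses, not a description of the checkpoint layout used in this repository.

```python
def extract_state_dict(checkpoint):
    """Return a plain {parameter_name: value} mapping from a loaded checkpoint.

    Assumption (not confirmed by this repository): the weights are stored
    either directly as a mapping or under a "state_dict" or "model" key.
    """
    weights = checkpoint
    for key in ("state_dict", "model"):
        if isinstance(checkpoint, dict) and isinstance(checkpoint.get(key), dict):
            weights = checkpoint[key]
            break

    # Drop the "module." prefix that DistributedDataParallel adds to parameter names.
    prefix = "module."
    return {
        (name[len(prefix):] if name.startswith(prefix) else name): value
        for name, value in weights.items()
    }
```

With PyTorch installed, the result could be passed to a model roughly as `model.load_state_dict(extract_state_dict(torch.load(path, map_location="cpu")))`, where `path` points to the provided checkpoint file and `model` is a VMINet-XS instance.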
