
# START: A Generalized State Space Model with Saliency-Driven Token-Aware Transformation



## Envs. for training

- Python 3.10.13

  - `conda create -n your_env_name python=3.10.13`

- torch 2.1.1 + cu118
  - `pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118`

- Requirements: vim_requirements.txt
  - `pip install -r vim/vim_requirements.txt`

- Install ``causal_conv1d`` and ``mamba``
  - `pip install "causal_conv1d>=1.1.0"` (quote the specifier; unquoted, the shell treats `>` as an output redirect)
  - `pip install -e mamba-1p1p1`
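
After the steps above, it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch, not part of the repository. The distribution names in `EXPECTED` (in particular `causal_conv1d` and `mamba_ssm`) are assumptions and may need adjusting to match your local builds.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Assumed distribution names; adjust them if your local builds differ.
EXPECTED = ["torch", "torchvision", "torchaudio", "causal_conv1d", "mamba_ssm"]


def report(packages=EXPECTED):
    """Print one line per package with its installed version, if any."""
    for name, ver in installed_versions(packages).items():
        print(f"{name:15s} {ver or 'NOT INSTALLED'}")
```

Calling `report()` inside the activated conda environment prints one line per package, which can be compared against the versions listed above.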
 
## Datasets
Please download the PACS dataset from [here](https://drive.google.com/drive/folders/0B6x7gtvErXgfUU1WcGY5SzdwZVk?resourcekey=0-2fvpQY_QSyJf2uIECzqPuQ).
Make sure you use the official train/val/test split from the [PACS paper](https://openaccess.thecvf.com/content_iccv_2017/html/Li_Deeper_Broader_and_ICCV_2017_paper.html).
For example, with `/data/DataSets/` as the storage directory, the files should be laid out as follows:
```
images -> /data/DataSets/PACS/kfold/art_painting/dog/pic_001.jpg, ...
splits -> /data/DataSets/PACS/pacs_label/art_painting_crossval_kfold.txt, ...
```
Then set `data_root` to `/data/DataSets/` and `data` to `PACS` in `main_dg.py`.

Alternatively, you can set `data_root` and `data` directly in `ft-vmamba-t.sh` when training the model.
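
Before launching training, the directory layout described above can be sanity-checked. The following is a minimal sketch under stated assumptions: `check_pacs_layout` is a hypothetical helper (not part of this repository), and it only verifies that the `kfold` image folder and the `pacs_label` split folder exist and are non-empty, following the example paths above.

```python
from pathlib import Path


def check_pacs_layout(data_root, data="PACS"):
    """Return a list of problems found in the dataset layout (empty if none)."""
    root = Path(data_root) / data
    images_dir = root / "kfold"        # e.g. kfold/art_painting/dog/pic_001.jpg
    splits_dir = root / "pacs_label"   # e.g. pacs_label/art_painting_crossval_kfold.txt
    problems = []

    if not images_dir.is_dir():
        problems.append(f"missing image directory: {images_dir}")
    elif not any(p.is_dir() for p in images_dir.iterdir()):
        problems.append(f"no domain folders found in: {images_dir}")

    if not splits_dir.is_dir():
        problems.append(f"missing split directory: {splits_dir}")
    elif not any(splits_dir.glob("*_kfold.txt")):
        problems.append(f"no *_kfold.txt split files found in: {splits_dir}")

    return problems
```

For instance, `check_pacs_layout("/data/DataSets/")` returns an empty list when both folders are in place, and a list of human-readable messages otherwise.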
  

## Training 


First, download the VMamba-T model pretrained on ImageNet from [here](https://github.com/MzeroMiko/VMamba/releases/download/%2320240218/vssmtiny_dp01_ckpt_epoch_292.pth) and save it to `/pretrained_model`. Then train START-M with the command below, setting the `--data_root` argument to match your dataset folder.

```
bash scripts/START-M.sh
```

You can also train the START-X model by running the following command:

```
bash scripts/START-X.sh
```




