# Lightweight Vision Transformer with Bidirectional Interaction

## Requirements

- Linux with Python >= 3.6
- PyTorch >= 1.8.1
- timm >= 0.3.2
- CUDA 11.3
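
The minimum versions listed above can be checked from inside the environment. The following is a minimal sketch, not part of this repository: the helper names `version_tuple`, `meets_minimum`, and `check_environment` are hypothetical, and the script only compares Python, PyTorch, and timm version numbers (it does not inspect the CUDA installation).

```python
import importlib
import re
import sys

# Minimum versions copied from the Requirements list (assumed to be the only hard constraints).
MINIMUMS = {"torch": "1.8.1", "timm": "0.3.2"}


def version_tuple(version):
    """Turn a string such as '1.8.1+cu111' into (1, 8, 1); the local build tag is ignored."""
    public = version.split("+")[0]
    return tuple(int(part) for part in re.findall(r"\d+", public)[:3])


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum version."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_environment():
    """Return a list of human-readable problems; an empty list means every check passed."""
    problems = []
    if sys.version_info < (3, 6):
        problems.append("Python >= 3.6 is required")
    for name, minimum in MINIMUMS.items():
        try:
            module = importlib.import_module(name)
        except ImportError:
            problems.append(f"{name} is not installed")
            continue
        installed = getattr(module, "__version__", "0")
        if not meets_minimum(installed, minimum):
            problems.append(f"{name} {installed} is older than {minimum}")
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script inside the activated environment prints one warning per unmet requirement and nothing when all checks pass.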


### Conda environment setup

**Note**: Our environment is the same as that of [LITv2](https://github.com/ziplab/LITv2).

```bash
conda create -n fat python=3.7
conda activate fat
# Install PyTorch and TorchVision
# PyTorch 1.8.1 wheels are tagged cu111 (CUDA 11.1); no cu113 build exists for this version
pip install torch==1.8.1+cu111 torchvision==0.9.1+cu111 torchaudio==0.8.1 -f https://download.pytorch.org/whl/torch_stable.html

pip install timm
pip install ninja
pip install tensorboard

# Install NVIDIA apex
git clone https://github.com/NVIDIA/apex
cd apex
pip install -v --disable-pip-version-check --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" ./
cd ../
rm -rf apex/

pip install opencv-python==4.4.0.46 termcolor==1.1.0 yacs==0.1.8

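# Build the large-kernel depthwise convolution extension shipped with RepLKNet
gh repo clone DingXiaoH/RepLKNet-pytorch
cd RepLKNet-pytorch
unzip cutlass.zip
cd cutlass/examples/19_large_depthwise_conv2d_torch_extension  # assumes the archive extracts to cutlass/
./setup.py install --user
# Replace WHERE_YOU_CLONED_CUTLASS with the absolute path of the extracted cutlass directory
export LARGE_KERNEL_CONV_IMPL=WHERE_YOU_CLONED_CUTLASS/examples/19_large_depthwise_conv2d_torch_extension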
```
More details can be found in `classification`, `detection`, and `segmentation`.
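
Since `LARGE_KERNEL_CONV_IMPL` has to point at the extension directory, a quick check before training can catch an unset or mistyped path. This is a minimal sketch under that assumption: the function name `large_kernel_impl_status` is hypothetical, and it only verifies that the variable is set and names an existing directory, not that the extension was compiled successfully.

```python
import os

# Variable name taken from the export line in the setup script.
ENV_VAR = "LARGE_KERNEL_CONV_IMPL"


def large_kernel_impl_status(environ=None):
    """Return 'unset', 'missing', or 'ok' for the directory named by LARGE_KERNEL_CONV_IMPL."""
    environ = os.environ if environ is None else environ
    path = environ.get(ENV_VAR)
    if not path:
        return "unset"    # variable absent or empty
    if not os.path.isdir(path):
        return "missing"  # variable set, but the directory does not exist
    return "ok"


if __name__ == "__main__":
    print(ENV_VAR, "status:", large_kernel_impl_status())
```

A result of `unset` usually means the `export` line was run in a different shell session; `missing` suggests the `WHERE_YOU_CLONED_CUTLASS` placeholder was not replaced with a real path.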

## Acknowledgement

Our code is built upon [DeiT](https://github.com/facebookresearch/deit), [Swin](https://github.com/microsoft/Swin-Transformer), [LIT](https://github.com/ziplab/LIT), [LITv2](https://github.com/ziplab/LITv2), and [RepLKNet](https://github.com/MegEngine/RepLKNet). We thank the authors for their open-source code.