# Self-training for Large Vision Language Models


## Install
Install the package and its dependencies:
```Shell
conda create -n stic python=3.10 -y
conda activate stic
pip install --upgrade pip  
pip install -e .
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
pip install trl
```

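## Instructions
1. Modify the `trl` library so that its DPO trainer works with large vision language models (LVLMs). Replace `dpo_trainer.py` in the installed `trl/trainer` directory with `tools/dpo_trainer.py` from this repository. The path below is an example; adjust it to your own environment.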
```Shell
cd /home/username/miniconda3/envs/stic/lib/python3.10/site-packages/trl/trainer
```
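The replacement in step 1 can also be done without hard-coding the site-packages path. The following is a minimal sketch, not part of the repository: `install_dpo_trainer` is a hypothetical helper name, and the `.bak` backup is an added precaution so the original trainer can be restored.

```Shell
# Hypothetical helper (not shipped with this repo): copy a replacement
# dpo_trainer.py into a trl/trainer directory, keeping a one-time backup.
install_dpo_trainer() {
    src="$1"        # replacement file, e.g. tools/dpo_trainer.py
    dest_dir="$2"   # target directory, e.g. .../site-packages/trl/trainer

    [ -f "$src" ] || { echo "missing source file: $src" >&2; return 1; }
    [ -d "$dest_dir" ] || { echo "missing directory: $dest_dir" >&2; return 1; }

    # Back up the original only once, so re-running never overwrites the backup.
    if [ -f "$dest_dir/dpo_trainer.py" ] && [ ! -f "$dest_dir/dpo_trainer.py.bak" ]; then
        cp "$dest_dir/dpo_trainer.py" "$dest_dir/dpo_trainer.py.bak"
    fi

    cp "$src" "$dest_dir/dpo_trainer.py"
}

# Usage (from the repository root, with the stic environment active):
#   TRL_TRAINER_DIR="$(python -c 'import os, trl; print(os.path.join(os.path.dirname(trl.__file__), "trainer"))')"
#   install_dpo_trainer tools/dpo_trainer.py "$TRL_TRAINER_DIR"
```

Resolving the directory through `trl.__file__` assumes `trl` is importable in the active environment. To undo the change, copy `dpo_trainer.py.bak` back over `dpo_trainer.py`.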

2. Run the DPO fine-tuning script from the repository root.
```Shell
bash scripts/dpo_finetune.sh
```


## Acknowledgement

- [LLaVA](https://github.com/haotian-liu/LLaVA)
- [POVID](https://github.com/YiyangZhou/POVID)
