
## Installation
The code base is built mostly on the public [LAVIS](https://github.com/salesforce/LAVIS) project. Please refer to its installation [guidance](https://github.com/salesforce/LAVIS#installation) for environment setup.

## Q-Former Weights
In the proposed tune-cross-evaluation paradigm, we perform instruction tuning mainly with the Q-Former as the projection module. The Q-Former weights pre-trained in BLIP2 with Vicuna7B as the language model, before instruction tuning, have been released in LAVIS and can be downloaded [here](https://storage.googleapis.com/sfr-vision-language-research/LAVIS/models/BLIP2/blip2_pretrained_vicuna7b.pth).

After downloading the Q-Former weights pre-trained in the first stage, set their path at the end of ``lavis/models/blip2_models/blip2_vicuna_instruct.py``, where it is hard-coded for loading.

## REVO-LION Data Description
We include the annotations as JSON files in ``REVO_LION_JSON``: ``All_tune.json`` is the annotation file of REVO-LION-Tune, and ``All_eval.json`` is the annotation file of REVO-LION-Eval. Note that the samples in both files are simplified into image-instruction-answer triplets for convenience. As a result, some samples, such as conversations, may be split into multiple instances.
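To illustrate how one conversation can become several triplets, here is a minimal sketch. The input layout and the field names (``image``, ``turns``, ``instruction``, ``answer``) are assumptions made for illustration only; check the JSON files in ``REVO_LION_JSON`` for the actual keys.

```python
from typing import Dict, List


def flatten_conversation(sample: Dict) -> List[Dict]:
    """Split one multi-turn sample into single-turn triplets.

    Assumed (hypothetical) input layout:
        {"image": str, "turns": [{"instruction": str, "answer": str}, ...]}
    """
    return [
        {
            "image": sample["image"],
            "instruction": turn["instruction"],
            "answer": turn["answer"],
        }
        for turn in sample["turns"]
    ]


# A made-up two-turn conversation about a single image.
conversation = {
    "image": "example.jpg",  # placeholder file name
    "turns": [
        {"instruction": "What is in the picture?", "answer": "A dog."},
        {"instruction": "What color is it?", "answer": "Brown."},
    ],
}

# Each turn becomes its own triplet that shares the same image.
triplets = flatten_conversation(conversation)
```

Under these assumptions, every turn yields an independent instance, so the number of instances in the annotation files can exceed the number of original samples.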

## Instruction Tuning and Evaluation on REVO-LION

### Dataset
We define the dataset class ``RefineDataset`` in ``lavis/datasets/datasets/instruction_datasets.py`` and its builder in ``lavis/datasets/builders/instruction_builder.py``. The paths of the annotation files of REVO-LION-Tune and REVO-LION-Eval are defined in ``lavis/configs/datasets/instruction/defaults_refine_all_tune.yaml`` and ``lavis/configs/datasets/instruction/defaults_refine_all_eval.yaml``, respectively.
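For orientation, the sketch below shows what a dataset over image-instruction-answer triplets might look like. It is not the actual ``RefineDataset``: the class name ``TripletAnnotationDataset``, the assumption that the annotation file is a JSON list, and the keys ``image``, ``instruction``, ``answer``, ``text_input``, and ``text_output`` are all hypothetical. Image loading and preprocessing are omitted.

```python
import json
import os
import tempfile
from typing import Dict, List


class TripletAnnotationDataset:
    """Hypothetical minimal dataset over image-instruction-answer triplets."""

    def __init__(self, ann_path: str, image_root: str) -> None:
        # Assumes the annotation file is a JSON list of triplet dicts.
        with open(ann_path, "r", encoding="utf-8") as f:
            self.annotations: List[Dict] = json.load(f)
        self.image_root = image_root

    def __len__(self) -> int:
        return len(self.annotations)

    def __getitem__(self, index: int) -> Dict:
        ann = self.annotations[index]
        return {
            # Only the path is resolved here; decoding the image is left out.
            "image_path": os.path.join(self.image_root, ann["image"]),
            "text_input": ann["instruction"],
            "text_output": ann["answer"],
        }


# Write a tiny stand-in annotation file so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp_dir:
    ann_path = os.path.join(tmp_dir, "toy_annotations.json")
    with open(ann_path, "w", encoding="utf-8") as f:
        json.dump(
            [{"image": "a.jpg", "instruction": "Describe the image.", "answer": "A cat."}],
            f,
        )
    dataset = TripletAnnotationDataset(ann_path, image_root="images")
    first = dataset[0]
```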

### Model
The model configurations and hyperparameters of instruction tuning are specified in ``lavis/projects/instruct_blip/train/instruct_blip_vicuna7b_Refine_Benchmark.yaml``.
To load the weights of the Q-Former pre-trained in the first stage, we introduce the argument ``train_replace`` and set it to ``True`` for instruction tuning.

### Run Scripts
After installing the environment and specifying the hyperparameters, you can run the script directly:
```bash
bash run_scripts/instrut_blip/train_REVO_LION.sh
```