# README

## Data Synthesis

We provide the prompt templates we used in `prompt_templates.py` and the data synthesis code in `build_text_qa.py`.

## Model Training

Please refer to the [LongVA](https://github.com/EvolvingLMMs-Lab/LongVA) repository for environment setup and the training script.

Specifically, prepare the LLaVA-Next dataset and replace its JSON file with our mix JSON (see below) to start training.

## Example Dataset

We provide a 2k subset (due to size limits) in `t3_mix_2k_subset.json`.
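
To check the subset before plugging it into training, a minimal sketch like the following may help. It assumes the file is a single JSON document (not JSON Lines). The helper name `summarize_json` is hypothetical and is not part of this repo.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and return (top-level type name, number of items).

    Hypothetical helper for illustration; assumes a single JSON document.
    """
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)
    # len() covers both a list of records and a dict of keyed records;
    # any other top-level value is counted as one item.
    size = len(data) if isinstance(data, (list, dict)) else 1
    return type(data).__name__, size


subset = Path("t3_mix_2k_subset.json")
if subset.exists():  # only runs when the file is in the working directory
    kind, size = summarize_json(subset)
    print(f"top-level type: {kind}, items: {size}")
```

Run it from the directory that contains `t3_mix_2k_subset.json`. The reported item count should be consistent with the 2k subset size if the top level is a list of samples, which is an assumption here.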