# ALDEN

## Document VQA

### Dataset Preprocessing

#### Corpus Building

Set the raw data path and the target path in `rag_serving/build_corpus.py`, then run the script:

```shell
python rag_serving/build_corpus.py
```

A prebuilt training corpus is available on Hugging Face: `SkyFishQ/ALDEN`.
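
If you prefer the prebuilt corpus over building it yourself, a download step might look like the sketch below. It assumes that `SkyFishQ/ALDEN` is published as a dataset repository and that the Hugging Face Hub CLI is installed; `CORPUS_DIR` and `download_corpus` are hypothetical names, not part of this repository.

```shell
# Assumption: SkyFishQ/ALDEN is a *dataset* repository on the Hugging Face Hub.
# CORPUS_DIR is a hypothetical local target directory; override it as needed.
CORPUS_DIR="${CORPUS_DIR:-./data/ALDEN}"

# Download the whole repository into CORPUS_DIR.
# Requires the CLI shipped with the huggingface_hub package.
download_corpus() {
  huggingface-cli download SkyFishQ/ALDEN \
    --repo-type dataset \
    --local-dir "$CORPUS_DIR"
}
```

Run `download_corpus` once the CLI is available. Newer `huggingface_hub` releases also offer the command as `hf download`, so adjust the function if `huggingface-cli` is missing from your installation.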

#### Image Index Building

Build the image index with the `vdr-2b-v1` retriever. Replace the `--corpus_path` and `--save_dir` values with your own paths.

```shell
cd ./flashrag/flashrag/retriever
python index_builder.py \
    --retrieval_method vdr-2b-v1 \
    --model_path llamaindex/vdr-2b-v1 \
    --corpus_path /scratch-scc/projects/scc_ulsb_fe/yang/images_corpus/images.parquet \
    --save_dir /scratch-scc/projects/scc_ulsb_fe/yang/images_index \
    --max_length 512 \
    --batch_size 128 \
    --faiss_type Flat \
    --index_modal image \
    --sentence_transformer \
    --save_embedding
```
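
Because the index build can run for a long time, it may help to validate the inputs first. The following is a minimal sketch, not part of the repository: `CORPUS_PATH`, `SAVE_DIR`, and `check_index_inputs` are hypothetical names, and the default paths are placeholders.

```shell
# Hypothetical defaults; point these at your own corpus file and index directory.
CORPUS_PATH="${CORPUS_PATH:-./images_corpus/images.parquet}"
SAVE_DIR="${SAVE_DIR:-./images_index}"

# Fail early if the corpus file ($1) is missing;
# otherwise create the output directory ($2) if it does not exist yet.
check_index_inputs() {
  if [ ! -f "$1" ]; then
    echo "corpus not found: $1" >&2
    return 1
  fi
  mkdir -p "$2"
}
```

A possible use is to run `check_index_inputs "$CORPUS_PATH" "$SAVE_DIR"` before `index_builder.py` and pass the same two variables to `--corpus_path` and `--save_dir`.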

### Launch RL

#### Tool Environment Serving

1. Get the IP address of the server:

    ```shell
    hostname --ip-address
    ```

2. Start the server:

    ```shell
    python rag_serving/serving.py --config rag_serving/serving_config.yaml --num_retriever 4 --port 42354
    ```
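
The two steps above can be followed by a quick reachability check before launching training. This is a sketch under assumptions: `server_ip`, `wait_for_port`, and `SERVING_PORT` are hypothetical helpers, not part of the repository, and the check only confirms that the TCP port accepts connections, not that the retrievers have finished loading. It relies on `bash` and the coreutils `timeout` command.

```shell
# Port passed to serving.py via --port; override through the environment if needed.
SERVING_PORT="${SERVING_PORT:-42354}"

# Print the first address reported by `hostname --ip-address`,
# falling back to 127.0.0.1 if the hostname cannot be resolved.
server_ip() {
  local ip
  ip="$(hostname --ip-address 2>/dev/null | awk '{print $1}')"
  echo "${ip:-127.0.0.1}"
}

# Return 0 as soon as host:port accepts a TCP connection,
# or 1 after `tries` failed attempts (one attempt per second).
wait_for_port() {
  local host="$1" port="$2" tries="${3:-30}" i=0
  while [ "$i" -lt "$tries" ]; do
    # bash's /dev/tcp pseudo-device opens a TCP connection on redirect.
    if timeout 1 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}
```

For example, `wait_for_port "$(server_ip)" "$SERVING_PORT" 60 && echo "server is reachable"` waits up to roughly a minute for the port to open.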

#### RL Training

```shell
bash examples/baselines/qwen2_5_vl_7b_doc_agent_ppo.sh
```

#### Inference

```shell
bash examples/baselines/qwen2_5_vl_7b_doc_agent_generation.sh
```

