
# M2R: Micro–Macro Retrieval Framework

This repository contains code and scripts for training and evaluating our micro–macro retrieval framework.  
All steps are provided for reproducibility. Sensitive paths and personal information have been removed for blind review.

---

## 1. Environment Setup

Install the Python dependencies from the repository root with `pip install -r requirements.txt`.

---

## 2. Start Sandbox Serving

```bash
tmux new -s M2R
cd M2R/scripts/serving
python sandbox.py --port 2501
```

---

## 3. Start Retriever Serving

```bash
conda activate sglang
cd ./scripts/serving
python retriever_serving.py \
    --config retriever_config.yaml \
    --num_retriever 1 \
    --port 2502
```
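
The training command in Section 4 points at both services through `--sandbox_url` and `--search_url`, so it can help to confirm that ports 2501 and 2502 accept connections before launching a run. The helper below is a minimal sketch, not part of this repository: `wait_for_port` is a hypothetical name, and it relies on bash's built-in `/dev/tcp` redirection, so it assumes `bash` rather than a plain POSIX shell. It only checks that the port is open, not that the service behind it is healthy.

```bash
# Hypothetical helper (not shipped with this repository): block until a TCP
# port accepts connections, or give up after a timeout given in seconds.
wait_for_port() {
    local host="$1" port="$2" timeout="${3:-60}"
    local waited=0
    # The subshell opens and immediately closes a connection via /dev/tcp.
    until (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            echo "timed out waiting for ${host}:${port}" >&2
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    echo "${host}:${port} is accepting connections"
}
```

For example, `wait_for_port 127.0.0.1 2501 && wait_for_port 127.0.0.1 2502` returns successfully once both the sandbox and the retriever are listening.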

---

## 4. Training

Training is organized into **Stage 1 (macro retrieval)** and **Stage 2 (macro retrieval + micro retrieval)**.
Example launch scripts are provided under `./scripts/train`.

Example (8 GPUs):

```bash
nohup bash train_ours.sh \
    --train_batch_size 8 \
    --ppo_mini_batch_size 2 \
    --use_re_call True \
    --prompt_template_name re_call_template_sys \
    --actor_model_path <backbone_model_path> \
    --search_url http://0.0.0.0:2502 \
    --sandbox_url http://0.0.0.0:2501 \
    --project_name <project_name> \
    --experiment_name <exp_name> \
    --nnodes 1 \
    --n_gpus_per_node 8 \
    --save_freq 500 \
    --test_freq 500 \
    --total_epochs 2 \
    --wandb_api_key None \
    --save_path ./save_dir_<exp_name> \
    --train_files "['<data_path>/train.parquet']" \
    --test_files "['<data_path>/test.parquet']" \
    --checkpoint_save ./checkpoints_<exp_name> \
    >> ./train_<exp_name>.log 2>&1 &
```

* Stage 1 focuses on macro retrieval with structured evidence saving.
* Stage 2 continues training from Stage 1 checkpoints, adding micro retrieval.
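
Because the launch command above runs in the background under `nohup` and appends all output to `./train_<exp_name>.log`, failures are easy to miss. The sketch below scans a log file for two generic failure markers (a Python traceback and the PyTorch out-of-memory message). `check_train_log` is a hypothetical helper, and the marker list is an assumption; it does not reflect any log format specific to this repository.

```bash
# Hypothetical helper: report generic failure markers found in a training log.
# Returns 0 if none are found, 1 if any are found, 2 if the log is missing.
check_train_log() {
    local log="$1"
    if [ ! -f "$log" ]; then
        echo "log not found: $log" >&2
        return 2
    fi
    # Generic Python / PyTorch markers; extend the pattern as needed.
    if grep -n -E 'Traceback \(most recent call last\)|CUDA out of memory' "$log"; then
        return 1
    fi
    echo "no error markers found in $log"
}
```

Call it with the log path used in the launch command, for example `check_train_log "./train_<exp_name>.log"`; matching lines are printed with their line numbers.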

---

## 5. Checkpoint Merging

After training, merge distributed checkpoints into a final model directory:

```bash
cd ./scripts/train
python model_merger.py merge \
    --backend fsdp \
    --local_dir ./save_dir_<exp_name>/global_step_xxx/actor \
    --target_dir <target_model_dir>
```
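
The `--local_dir` argument above needs a concrete `global_step_*` directory. As a convenience, the sketch below picks the directory with the highest step number under a save directory. `latest_step_dir` is a hypothetical helper, and it assumes that checkpoints are stored as `global_step_<N>` subdirectories with a purely numeric `<N>`, as the path in the merge command suggests.

```bash
# Hypothetical helper: print the global_step_<N> directory with the largest N.
latest_step_dir() {
    local save_dir="$1" best="" best_step=-1 dir step
    for dir in "${save_dir}"/global_step_*; do
        [ -d "$dir" ] || continue
        step="${dir##*global_step_}"
        case "$step" in
            ''|*[!0-9]*) continue ;;  # skip names without a purely numeric step
        esac
        # Compare numerically so that step 1000 ranks above step 500.
        if [ "$step" -gt "$best_step" ]; then
            best="$dir"
            best_step="$step"
        fi
    done
    if [ -z "$best" ]; then
        return 1
    fi
    echo "$best"
}
```

The result can then be passed to the merger as `--local_dir "$(latest_step_dir "./save_dir_<exp_name>")/actor"`.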

Make sure to copy `config.json`, `vocab.json`, and other required config files into `<target_model_dir>`.
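
One way to do this copy is sketched below. It assumes the config and tokenizer files are taken from the backbone model directory (the one passed as `--actor_model_path`); the source of these files is not specified here, so adjust it to your setup. `copy_model_configs` is a hypothetical helper that copies every regular file except weight files and shard indexes, and never overwrites a file already present in the target directory.

```bash
# Hypothetical helper: copy config/tokenizer files from a source model
# directory into the merged model directory, leaving weights untouched.
copy_model_configs() {
    local src="$1" dst="$2" f name
    mkdir -p "$dst"
    for f in "$src"/*; do
        [ -f "$f" ] || continue
        name="$(basename "$f")"
        case "$name" in
            # Skip weight files and shard indexes that describe them.
            *.safetensors|*.bin|*.pt|*.pth|*.index.json) continue ;;
        esac
        # Keep anything the merger already wrote to the target directory.
        [ -e "$dst/$name" ] || cp "$f" "$dst/$name"
    done
}
```

For example, `copy_model_configs "<backbone_model_path>" "<target_model_dir>"` fills in `config.json`, `vocab.json`, and similar files that are missing from the merged directory.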

---

## 6. Inference

Serve the trained model with [sglang](https://github.com/sgl-project/sglang):

```bash
python3 -m sglang.launch_server \
    --served-model-name <model_name> \
    --model-path <model_path> \
    --tp 8 \
    --context-length 8192 \
    --enable-metrics \
    --dtype bfloat16 \
    --host 0.0.0.0 \
    --port 30001 \
    --trust-remote-code \
    --disable-overlap-schedule \
    --disable-radix-cache
```

Set `--tp` (the tensor-parallel size) to the number of GPUs used for serving; the example above uses 8.
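
If the GPU count differs between machines, it can be derived instead of hard-coded. The sketch below is an illustration under two assumptions: one tensor-parallel rank per visible GPU, and `nvidia-smi` on the `PATH` when `CUDA_VISIBLE_DEVICES` is unset. `gpu_count` is a hypothetical helper name.

```bash
# Hypothetical helper: print the number of GPUs visible to this shell.
gpu_count() {
    if [ -n "${CUDA_VISIBLE_DEVICES:-}" ]; then
        # Count comma-separated entries, e.g. "0,1,2,3" gives 4.
        echo "$CUDA_VISIBLE_DEVICES" | tr ',' '\n' | grep -c .
    elif command -v nvidia-smi >/dev/null 2>&1; then
        # `nvidia-smi -L` prints one "GPU <index>: ..." line per device.
        nvidia-smi -L | grep -c '^GPU '
    else
        echo 0
    fi
}
```

The launch command can then use `--tp "$(gpu_count)"` in place of the fixed value.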

---

## Notes

* Replace all `<...>` placeholders (paths, names) with your local configuration.
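* Training outputs are stored under the paths passed to `--save_path` and `--checkpoint_save` (`./save_dir_<exp_name>` and `./checkpoints_<exp_name>` in the example); console output is appended to `./train_<exp_name>.log`.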
* The pipeline is designed to be reproducible without manual labeling of reasoning chains.


