# ACE-RAG

This artifact reproduces ACE-RAG on Wikipedia-style QA benchmarks.
It focuses on ACE-RAG outputs, ablations, and evaluation protocol, not on
regenerating every baseline from scratch.

Document/page titles are used as canonical doc anchors.
The default doc-anchor mode is `title`, matching the submitted paper.
Title-free doc-anchor induction is outside the scope of this artifact and left for future work.

Public datasets and model checkpoints are not included.
Users must obtain datasets from their original sources.
Baseline values are reference values unless raw baseline predictions are explicitly provided.
Common-prompt and native-prompt protocols are separate evaluation protocols, not arbitrary tuned variants.
Official paper protocols are specified in `configs/`.

## Artifact Scope

```text
main.py
utils/                  ACE-RAG indexing, retrieval, generation, and evaluation code
configs/                official paper protocol configs
scripts/                paper reproduction and validation entry points
README.md
REPRODUCE.md
environment.yml
requirements.txt
requirements.lock.txt
```

Official configs:

- `configs/protocol_common_prompt.yaml`: main common-prompt ACE-RAG protocol.
- `configs/protocol_native_prompt.yaml`: appendix native-prompt ACE-RAG protocol.
- `configs/ablation_local_refinement.yaml`: local refinement ablation settings.
- `configs/smoke_test.yaml`: small validation config.

## Environment Setup

```bash
conda env create -f environment.yml
conda activate ace-rag
pip install -r requirements.txt
```

`environment.yml` already installs the packages listed in `requirements.txt`.
The explicit `pip install` line is there so reviewers can refresh the Python
dependencies after creating the environment.

## Dataset Layout

Download the public benchmarks from their original sources, then arrange them
under one data root:

```bash
python scripts/prepare_data_layout.py --src /path/to/downloaded/files --dst /path/to/data
export ACE_RAG_DATA_DIR=/path/to/data
export ACE_RAG_OUTPUT_DIR=/path/to/output
```

Expected files:

```text
${ACE_RAG_DATA_DIR}/popqa/popqa.json
${ACE_RAG_DATA_DIR}/popqa/popqa_corpus.json
${ACE_RAG_DATA_DIR}/hotpotqa/hotpotqa.json
${ACE_RAG_DATA_DIR}/hotpotqa/hotpotqa_corpus.json
${ACE_RAG_DATA_DIR}/2wiki/2wikimultihopqa.json
${ACE_RAG_DATA_DIR}/2wiki/2wikimultihopqa_corpus.json
${ACE_RAG_DATA_DIR}/musique/musique.json
${ACE_RAG_DATA_DIR}/musique/musique_corpus.json
```
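
Before launching a run, it can help to confirm that the layout above is complete. The following is a minimal sketch, not part of the artifact: the helper name `missing_files` is hypothetical, and it only checks that each path exists as a regular file, not that its contents are valid.

```python
import os
from pathlib import Path

# Relative paths copied from the "Expected files" list above.
EXPECTED_FILES = [
    "popqa/popqa.json",
    "popqa/popqa_corpus.json",
    "hotpotqa/hotpotqa.json",
    "hotpotqa/hotpotqa_corpus.json",
    "2wiki/2wikimultihopqa.json",
    "2wiki/2wikimultihopqa_corpus.json",
    "musique/musique.json",
    "musique/musique_corpus.json",
]


def missing_files(data_dir):
    """Return the expected relative paths that are not regular files under data_dir."""
    root = Path(data_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


def report_missing():
    """Print missing files for ACE_RAG_DATA_DIR and return how many are missing."""
    data_dir = os.environ.get("ACE_RAG_DATA_DIR")
    if not data_dir:
        print("ACE_RAG_DATA_DIR is not set")
        return len(EXPECTED_FILES)
    missing = missing_files(data_dir)
    for rel in missing:
        print(f"missing: {rel}")
    return len(missing)
```

Calling `report_missing()` after exporting `ACE_RAG_DATA_DIR` prints one line per absent file and returns `0` when the layout is complete.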

## Model Setup

Serve Qwen2.5-7B-Instruct through an OpenAI-compatible vLLM endpoint:

```bash
vllm serve Qwen/Qwen2.5-7B-Instruct \
  --gpu-memory-utilization 0.55 \
  --tensor-parallel-size 1 \
  --max-model-len 8192 \
  --port 8013
```

Configure the generator endpoint:

```bash
export ACE_RAG_LLM_BASE_URL=http://localhost:8013/v1
export VLLM_API_KEY=EMPTY
```
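
To see whether the endpoint is reachable with these settings, one option is to list the served models through the OpenAI-compatible `/models` route. This is a rough sketch using only the standard library, independent of `scripts/check_vllm.py`; the function names are hypothetical, and the fallback values mirror the exports above only as an assumption.

```python
import json
import os
import urllib.request


def models_url(base_url):
    """Build the model-listing URL for an OpenAI-compatible base URL."""
    return base_url.rstrip("/") + "/models"


def endpoint_from_env():
    """Read the endpoint settings from the variables exported above."""
    # Assumption: fall back to the values shown in this README when unset.
    base_url = os.environ.get("ACE_RAG_LLM_BASE_URL", "http://localhost:8013/v1")
    api_key = os.environ.get("VLLM_API_KEY", "EMPTY")
    return base_url, api_key


def list_served_models(base_url, api_key="EMPTY", timeout=10.0):
    """Return the model ids reported by GET {base_url}/models."""
    request = urllib.request.Request(
        models_url(base_url),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    return [entry["id"] for entry in payload.get("data", [])]
```

With the server from the previous step running, `list_served_models(*endpoint_from_env())` should include the served model name; a connection error usually means the port or base URL does not match.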

Download or provide a local NV-Embed-v2 checkpoint:

```bash
huggingface-cli download nvidia/NV-Embed-v2 \
  --local-dir /path/to/nvidia/NV-Embed-v2
export NVEMBED_MODEL_PATH=/path/to/nvidia/NV-Embed-v2
```

NV-Embed-v2 is not included in the archive. The embedding loader uses local
files, so the model must already exist at `NVEMBED_MODEL_PATH`.
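
Because the loader reads local files only, a missing or mistyped path will fail at load time. A quick sanity check might look like the sketch below. It is not the artifact's `scripts/check_nvembed.py`; the function name is hypothetical, and the `config.json` test rests on the assumption that the checkpoint follows the usual Hugging Face directory layout.

```python
from pathlib import Path


def checkpoint_problems(model_path):
    """Return human-readable problems; an empty list means the path looks usable."""
    if not model_path:
        return ["NVEMBED_MODEL_PATH is not set"]
    root = Path(model_path)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    # Assumption: a Hugging Face style checkpoint directory contains config.json.
    if not (root / "config.json").is_file():
        return [f"{root} has no config.json"]
    return []
```

For example, `checkpoint_problems(os.environ.get("NVEMBED_MODEL_PATH"))` (after `import os`) returns an empty list when the directory exists and contains `config.json`. It does not verify that the weights themselves are complete.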

## Preflight Checks

```bash
python scripts/check_env.py
python scripts/check_vllm.py
python scripts/check_nvembed.py
```
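
The three checks can also be chained so that the first failure stops the sequence. The sketch below is one way to do that from Python; `run_checks` is a hypothetical helper, and it assumes each check script signals failure with a non-zero exit code, which this README does not state.

```python
import subprocess
import sys

# Script paths taken from the preflight commands above.
CHECK_SCRIPTS = [
    "scripts/check_env.py",
    "scripts/check_vllm.py",
    "scripts/check_nvembed.py",
]


def run_checks(scripts=CHECK_SCRIPTS, runner=subprocess.run):
    """Run each script in order; return the first failing script, or None."""
    for script in scripts:
        # Assumption: a non-zero return code means the check failed.
        result = runner([sys.executable, script])
        if result.returncode != 0:
            return script
    return None
```

Run from the repository root, `run_checks()` returns `None` when every script exits with code 0, or the path of the first script that did not. The `runner` parameter exists only so the helper can be exercised without launching real processes.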

## Smoke Test

```bash
python -m compileall main.py scripts utils
python main.py --help
bash scripts/run_smoke_all.sh 3
```

## Paper Protocol Reproduction

```bash
bash scripts/run_paper_common_prompt.sh hotpotqa
bash scripts/run_paper_native_prompt.sh hotpotqa
```

Set the `LIMIT` environment variable to control the run size, from paper-scale down to smaller runs:

```bash
LIMIT=1000 bash scripts/run_paper_common_prompt.sh 2wiki
LIMIT=1000 bash scripts/run_paper_native_prompt.sh 2wiki
```

## Ablation Reproduction

```bash
bash scripts/run_paper_ablation.sh hotpotqa
```

## Release Check

For a small model-enabled end-to-end check:

```bash
ACE_RAG_LLM_BASE_URL=http://localhost:8013/v1 bash scripts/run_release_reproduction_check.sh
```

## Excluded Assets

Public datasets, model checkpoints, generated outputs, logs, caches, PDFs,
paper sources, and figures are not included.
