# Code and Data of REFTOOL: Reference-Guided Tool Creation for Knowledge-Intensive Reasoning

## Code
### Tool Creation
Code related to tool creation is in `code/tool_creation/`.

The steps below use the causality book as an example. First, place the LaTeX source of the book in `books/`.
- Extract the book structure:
```
python extract_book_structure.py --domain causality
```
- Generate tools:
```
python initial_tool_generation.py --model_name gpt4o --domain causality
```
- Validate tools:
```
python validate_tools.py --model_name gpt4o --domain causality --stage unfiltered
```
- Refine tools:
```
python refine_tools.py --model_name gpt4o --domain causality
```
- Validate again:
```
python validate_tools.py --model_name gpt4o --domain causality --stage refined
```
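
The five steps above can also be chained from a small Python driver. The following is only a sketch: the helper names `tool_creation_commands` and `run_all` are hypothetical and not part of this repository, and it assumes the script is launched from `code/tool_creation/`. Script names and flags are copied from the commands listed above.

```python
import subprocess


def tool_creation_commands(model_name: str, domain: str) -> list[list[str]]:
    """Hypothetical helper: return the five commands in the order listed above."""
    common = ["--model_name", model_name, "--domain", domain]
    return [
        ["python", "extract_book_structure.py", "--domain", domain],
        ["python", "initial_tool_generation.py", *common],
        ["python", "validate_tools.py", *common, "--stage", "unfiltered"],
        ["python", "refine_tools.py", *common],
        ["python", "validate_tools.py", *common, "--stage", "refined"],
    ]


def run_all(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run by default, so nothing is executed until dry_run=False is passed
    run_all(tool_creation_commands("gpt4o", "causality"), dry_run=True)
```

Stopping at the first failure matters here because each step is listed after the one whose output it presumably consumes.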

### Inference
Code related to inference (including the PoT baseline and tool utilization) and evaluation is in `code/inference/`.

#### PoT Baseline
Because PoT + RefTool reduces to plain PoT when no tool is selected, first run inference with PoT (Program-of-Thoughts):
```
python run_pot_0shot.py --model_name gpt4o --domain causality
```
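
The fallback relationship between the two methods can be illustrated as follows. This is a minimal sketch of the idea only, not the repository's actual logic: the function `pick_final_answer` and its arguments are hypothetical names introduced for illustration.

```python
from typing import Optional, Sequence


def pick_final_answer(
    selected_tools: Sequence[str],
    pot_answer: str,
    tool_answer: Optional[str],
) -> str:
    """Hypothetical helper: reuse the PoT output when no tool was selected."""
    if not selected_tools or tool_answer is None:
        # No tool selected: PoT + RefTool coincides with plain PoT
        return pot_answer
    return tool_answer
```

Under this assumption, the PoT run has to exist before tool utilization starts, which is consistent with the ordering of the steps in this section.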

#### Tool Utilization
- Chapter selection:
```
python select_chapter.py --model_name gpt4o --domain causality
```
- Tool selection within the selected chapter:
```
python select_skills_by_chapter.py --model_name gpt4o --domain causality
```
- Solution generation:
```
python run_tool_0shot.py --model_name gpt4o --domain causality
```

#### Evaluation
```
python evaluator.py --model_name gpt4o --domain causality --method tool_0shot --force_generate
```

## Evaluation Data
Evaluation questions for causality, physics, and chemistry are in `evaluation_data/qrdata_causal.json`, `evaluation_data/theoremqa_phy.json`, and `evaluation_data/scibench_chem.json`, respectively.

For causality, also download the accompanying data files from the original QRData benchmark at `https://github.com/xxxiaol/QRData/blob/main/benchmark/data.zip`. Unzip the archive and place the resulting `data/` directory under `evaluation_data/`.
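
Before running inference, it may help to confirm that the layout described above is in place. The check below is a sketch: the helper `missing_evaluation_paths` is hypothetical, and it assumes the script is run from the repository root. File and directory names are taken from this section.

```python
from pathlib import Path

# Question files named in this section
EXPECTED_FILES = ["qrdata_causal.json", "theoremqa_phy.json", "scibench_chem.json"]


def missing_evaluation_paths(root="evaluation_data"):
    """Hypothetical helper: list expected entries that are absent under root."""
    root = Path(root)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    # The unzipped data/ directory is only needed for the causality questions
    if not (root / "data").is_dir():
        missing.append("data/")
    return missing


if __name__ == "__main__":
    for entry in missing_evaluation_paths():
        print(f"missing: evaluation_data/{entry}")
```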

We do not provide the LaTeX files of the reference materials because of copyright restrictions.