### Dataset Preparation


### Evaluation

We provide a unified evaluation framework for whole-proof generation methods. Evaluation consists of three steps:

#### Step 1: Proof Generation
```bash
cd evaluation
python generation.py --prover_name deepseek_v15_rl --gpu 4 --dataset_path "LeanBenchmark" --n 32
```
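If you prefer to launch Step 1 from a script (for example, to repeat it with different settings), the command can be assembled programmatically. This is a minimal sketch that uses only the flags shown above; `build_generation_command` is a hypothetical helper, not part of this repository, and the actual launch is left commented out because it assumes you are running from the repository root.

```python
import shlex
import subprocess  # only needed if you uncomment the launch line below


def build_generation_command(prover_name, gpu, dataset_path, n):
    """Return the argv list for generation.py, using only the flags documented above."""
    return [
        "python", "generation.py",
        "--prover_name", prover_name,
        "--gpu", str(gpu),
        "--dataset_path", dataset_path,
        "--n", str(n),
    ]


cmd = build_generation_command("deepseek_v15_rl", 4, "LeanBenchmark", 32)

# Print the command so it can be checked before anything is launched.
print(shlex.join(cmd))

# Assumption: generation.py lives in the evaluation/ directory, as in the shell snippet.
# subprocess.run(cmd, cwd="evaluation", check=True)
```

Passing the arguments as a list avoids shell quoting problems if a dataset path contains spaces.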

#### Step 2: Evaluation Setup
We use a modified version of kimina-lean-server, adapted for our evaluation environment, with Lean 4.18.0:
```bash
cd kimina-lean-server
pip install -e .
cp .env.template .env
bash setup.sh
bash setup_local.sh
```

#### Step 3: Running Evaluation
```bash
cd ..
python eval.py --input_file file_name  # file_name is a placeholder for your input file
```
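This section does not specify which metric `eval.py` reports or what its output looks like. If you have per-problem counts of verified proofs from the `--n 32` samples, one common way to summarize them is the unbiased pass@k estimator, 1 - C(n-c, k) / C(n, k). The sketch below is an illustration under that assumption; `pass_at_k` and the `verified_counts` dictionary are hypothetical and not part of this repository.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem.

    n: number of sampled proofs, c: number that verified, k: sampling budget (k <= n).
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # comb(n - c, k) is 0 when k > n - c, which makes the estimate exactly 1.0.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Hypothetical per-problem counts of verified proofs out of n = 32 samples.
verified_counts = {"problem_a": 0, "problem_b": 8, "problem_c": 32}
n = 32

for k in (1, 8, 32):
    scores = [pass_at_k(n, c, k) for c in verified_counts.values()]
    print(f"pass@{k}: {sum(scores) / len(scores):.4f}")
```

Averaging the per-problem estimates gives a benchmark-level score; with `k` equal to `n`, the estimate reduces to whether at least one sample verified.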