
# Prover Agent: An Agent-Based Framework for Formal Mathematical Proofs


<h2>
  <div>🚀 Quick Start</div>
    <img width="80%" height="8px" src="docs/_images/line.svg" />
</h2>


### 1. Move into the repository

```bash
cd Prover-Agent
```

### 2. Set up the Lean 4 environment

1. Install Lean 4 by following the [official installation guide](https://lean-lang.org/install/).

2. Initialize the Lean workspace:

```bash
cd lean_workspace
lake exe cache get
lake update
lake build # Check that the environment is set up successfully
cd ..
```
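If `lake build` fails with a "command not found" error, the Lean toolchain is probably not on your `PATH`. The check below is a minimal sketch; `require_cmd` is a hypothetical helper and is not part of this repository.

```bash
# Hypothetical helper: report whether a command is available on PATH.
require_cmd() {
  if command -v "$1" > /dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1 (is the Lean toolchain on your PATH?)" >&2
    return 1
  fi
}

# Usage, run from any directory:
# require_cmd lean && require_cmd lake
```

If either command is reported missing, opening a new shell after installation is often enough to pick up the updated `PATH`.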


### 3. Install Python dependencies

```bash
# "data" for dataset preparation and "server" for running the LLM servers
pip install -e '.[data,server]'
```

### 4. Prepare the MiniF2F dataset

```bash
python scripts/prepare_minif2f.py
```
This creates `data/miniF2F/test.json`, which contains the processed MiniF2F dataset.
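
Before launching a long benchmark run, it may be worth confirming that the file exists and parses as JSON. The snippet below is a minimal sketch: `check_json` is a hypothetical helper, not part of this repository, and it only checks that the file is well formed without assuming anything about its schema.

```bash
# Hypothetical helper: succeed only if the file is non-empty and valid JSON.
check_json() {
  local path="$1"
  if [ ! -s "$path" ]; then
    echo "missing or empty: $path" >&2
    return 1
  fi
  # json.tool is part of the Python standard library; it exits non-zero on invalid JSON.
  if ! python3 -m json.tool "$path" > /dev/null 2>&1; then
    echo "invalid JSON: $path" >&2
    return 1
  fi
  echo "ok: $path"
}

# Usage, after running scripts/prepare_minif2f.py:
# check_json data/miniF2F/test.json
```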

### 5. Configure the LLM server for your environment

Edit `serve.sh` (which launches the vLLM server) and `server_config.yml` (the LiteLLM proxy configuration) to match your hardware.
The provided examples are configured for 8 × 40 GB A100 GPUs, the setup used in our experiments.

### 6. Start the LLM servers and run Prover Agent on the MiniF2F benchmark

Run the following script, which is provided as `run_minif2f.sh`:

```bash
#!/bin/bash

# Start the vLLM server
./serve.sh &
sleep 180 # Wait for the server to be ready

# Start the LiteLLM proxy server
litellm --config server_config.yml &
sleep 10 # Wait for the server to be ready

# Run Prover Agent on the MiniF2F benchmark
python dispatch_benchmark.py \
  --benchmark miniF2F \
  --phase test \
  --num_workers 16 # Adjust according to your machine specifications
```
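
The fixed `sleep` durations above may be too short or too long depending on model size and hardware. One alternative is to poll each server until it responds. The following is a minimal sketch, not the script shipped in this repository: `wait_until` is a hypothetical helper, and the ports in the usage comments (8000 for vLLM, 4000 for the LiteLLM proxy) are assumed defaults that should be adjusted to match `serve.sh` and `server_config.yml`.

```bash
# Hypothetical helper: retry a probe command until it succeeds or a timeout expires.
# Arguments: <timeout_seconds> <interval_seconds> <probe command...>
wait_until() {
  local timeout_s="$1"
  local interval_s="$2"
  shift 2
  local start now
  start=$(date +%s)
  until "$@" > /dev/null 2>&1; do
    now=$(date +%s)
    if [ $(( now - start )) -ge "$timeout_s" ]; then
      echo "timed out after ${timeout_s}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval_s"
  done
}

# Usage in place of the fixed sleeps (ports are assumptions):
# ./serve.sh &
# wait_until 600 5 curl -sf http://localhost:8000/health || exit 1
# litellm --config server_config.yml &
# wait_until 60 2 curl -sf http://localhost:4000/health/liveliness || exit 1
```

Polling lets the benchmark start as soon as both servers are reachable, and it fails early with a clear message if a server never comes up.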

### 7. Check the results
Results are saved in the `runs/` directory, which is created automatically.


> [!IMPORTANT]
> We strongly recommend manually double-checking the resulting proofs, since the program may not catch every error.

> [!NOTE]
> The formalizer model is used only occasionally in the workflow. Running its dispatch stage separately, rather than keeping the model deployed for the entire workflow, can save computational resources and improve throughput. See the [usage guide](docs/usage_guide.md) for more information.
