
## Datasets

### ConstructiveBench

- **Location**: `data/dataset/constructivebench.json`  
- **Description**: Curated Olympiad-style problems with metadata and aligned Lean formalizations. Each entry includes:
  - Problem statement
  - Category (e.g., Algebra, Combinatorics)
  - Formal answer in Lean
  - Full formal theorem
  - Answer-construction alignment parts (header, answer, theorem with/without answer)

```json
{
  "name": "IMO2011SLC4",
  "category": "Combinatorics",
  "source": "IMO/2011",
  "problem": "...",
  "answer": "The greatest such number k is 3",
  "formalization": "...",
  "...": "..."
}
```

The figure below shows the problem sources and problem domains in ConstructiveBench.

![Domains and Categories](assets/dataset.png)
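
To inspect the dataset programmatically, something like the following may be enough. It is a minimal sketch that assumes the file is a top-level JSON array of objects carrying the `category` key shown in the example entry. The helper names `load_entries` and `count_by_category` are illustrative and are not part of this repository.

```python
import json
from collections import Counter
from pathlib import Path


def load_entries(path):
    """Read the dataset file; assumes a top-level JSON array of entries."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def count_by_category(entries):
    """Count entries per `category`, using "Unknown" when the key is absent."""
    return Counter(entry.get("category", "Unknown") for entry in entries)
```

From the repository root, `count_by_category(load_entries("data/dataset/constructivebench.json"))` would then give a per-category tally, provided the layout assumption holds.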



## Requirements

- Python 3.10  
- Lean Proof Assistant 



## Environment Setup
1. **Create a Python Virtual Environment**  
   We recommend using `venv`:
   ```bash
   python -m venv imosolver
   source imosolver/bin/activate
   pip install -r requirements.txt
   ```

2. **Build Lean Environment**  
   Build the Lean environment for both the newest version, `v4.23.0`, and `v4.9.0-rc1` (used by the prover models). This may take around 30 minutes.  
   Because of the supplementary material size limit, the mathlib dependencies of Goedel-Prover, Kimina-Prover, and DeepSeek-Prover are hosted in the GitHub repository linked below. That repository does not belong to the authors and is not affiliated with them.

   ```bash
   cd Formalization
   lake update
   lake build Main
   cd ..

   cd prover
   git clone https://github.com/xinhjBrant/mathlib4.git
   cd mathlib4
   lake build
   cd ../..
   ```

3. **Set Up LLM API Keys**  
   Either export them from your shell startup file or set them directly in `appl.yaml`:
   ```bash
   echo 'export OPENAI_API_KEY="your_openai_key_here"' >> ~/.bashrc
   echo 'export DEEPSEEK_API_KEY="your_deepseek_key_here"' >> ~/.bashrc
   source ~/.bashrc
   ```
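
Before launching a long run, it can help to confirm that the keys are actually visible to Python. The sketch below checks the two variable names used in the export commands above. The helpers `missing_keys` and `report` are hypothetical, and whether both keys are needed depends on which models you select.

```python
import os

# Variable names taken from the export commands above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "DEEPSEEK_API_KEY")


def missing_keys(env=None, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def report(env=None):
    """Summarize the check as a one-line message."""
    missing = missing_keys(env)
    if missing:
        return "Missing API keys: " + ", ".join(missing)
    return "All API keys are set."
```

Running `print(report())` in the activated virtual environment shows which variables, if any, still need to be exported.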






## Example Runs

The file `src/ecp/main.py` provides a unified interface for three pipelines:

1. **`answer_gen`**: Full ECP pipeline (Enumerate → Conjecture → Verify)  
2. **`autoformalize`**: Generate Lean formalizations from informal problems/answers  
3. **`proof_gen`**: Use a formal prover to generate complete Lean proofs  

### 1. Input Dataset

- Use `--problem_path` to specify the dataset.  
  - Main option: `constructivebench` (recommended)  
  - For testing: `test` (runs a single case)

### 2. Choosing the Pipeline

- Set `--mode` to one of:
  - `answer_gen`  
  - `autoformalize`  
  - `proof_gen`  

### 3. Key Flags

- `--enable_enumerator`:  
  - `True`: Run full ECP (enumerator + conjecturer)  
  - `False`: Skip enumeration (Chain-of-Thought baseline)  
- `--problem_name`:  
  - `"all"` (default) to process all entries  
  - Or a comma-separated list of specific problem names  
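
When scripting several runs, the flags above can be assembled in Python rather than typed by hand. The following is a sketch only: `build_command` is a hypothetical helper, and it assumes that `--enable_enumerator` accepts the lowercase spellings `true` and `false`.

```python
import sys

MODES = ("answer_gen", "autoformalize", "proof_gen")


def build_command(mode, problem_path="constructivebench",
                  enable_enumerator=None, problem_names=None):
    """Assemble an argv list for src/ecp/main.py from the flags described above."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    cmd = [sys.executable, "src/ecp/main.py",
           "--mode", mode, "--problem_path", problem_path]
    if enable_enumerator is not None:
        # Assumption: the flag takes a textual boolean value.
        cmd += ["--enable_enumerator", "true" if enable_enumerator else "false"]
    if problem_names:
        # Specific problems are passed as one comma-separated value.
        cmd += ["--problem_name", ",".join(problem_names)]
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from the repository root. Leaving `problem_names` empty omits `--problem_name`, so the default `"all"` applies.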



### Example Commands

#### A. Run Autoformalization

```bash
python src/ecp/main.py --mode autoformalize --problem_path constructivebench
```

#### B. Run Answer-Generation (Enumerate-Conjecture)

```bash
python src/ecp/main.py --mode answer_gen --problem_path constructivebench --enable_enumerator true
```
> **Output Location**:  
> `output/data/dataset/constructivebench.json/deepseek-chat-code/`  
> (*To run the CoT baseline, set `--enable_enumerator false`.*)

#### C. Run Proof-Generation (Prove)

After generating formalizations and conjectures (via `answer_gen`), run:

```bash
python src/ecp/main.py --mode proof_gen --problem_path constructivebench
```
> **Note**: Proof generation uses Goedel-Prover by default. You can override this with `--prover_model`, for example `deepseek-ai/DeepSeek-Prover-V2-7B` or `AI-MO/Kimina-Prover-Preview-Distill-7B`.



## Default Models & Parameters

- `--enumerator_model`: `deepseek-chat`  
- `--conjecturer_model`: `deepseek-chat`  
- `--prover_model`: `Goedel-LM/Goedel-Prover-SFT`  
- `--max_tokens`: `4096`  
- `--timeout`: `60` (seconds)  
- `--pass_at_n`: `32` (Pass@n metric for proof generation)  
- `--gpu`: `1` (number of GPUs for proof generation)  
- `--use_embedding_search`: `False`  
  (Set to `True` only if you have GPU resources for embedding-based Lean retrieval.)

You can override any of these options. Check `src/ecp/main.py` for the full list of arguments.
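
For readers unfamiliar with the `--pass_at_n` setting, the sketch below illustrates the usual reading of Pass@n: a problem counts as solved if at least one of its n sampled proofs is verified. This is an assumption about the metric, not a copy of the repository's evaluation code, and the function names are hypothetical.

```python
def solved_at_n(attempt_verified, n):
    """True if any of the first n proof attempts for one problem was verified."""
    return any(attempt_verified[:n])


def pass_at_n_rate(results, n=32):
    """Fraction of problems with at least one verified proof among n attempts.

    `results` maps a problem name to a list of booleans, one per sampled proof.
    """
    if not results:
        return 0.0
    solved = sum(solved_at_n(attempts, n) for attempts in results.values())
    return solved / len(results)
```

Under this reading, lowering `--pass_at_n` reduces the compute spent on each problem but can only lower or preserve the reported rate.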



## Summarize



## Acknowledgement
The prover part of this repository uses code from [Goedel-Prover](https://github.com/Goedel-LM/Goedel-Prover). As mentioned in the paper, some of the raw problems in ConstructiveBench come from the following datasets:
- [Omni-MATH](https://omni-math.github.io/)
- [OlympiadBench](https://github.com/OpenBMB/OlympiadBench)
- [MathOdyssey](https://github.com/protagolabs/odyssey-math)
