# DeepEvolve

A search and coding agent for discovering new algorithms across scientific domains.

## Comparison: DeepEvolve vs AlphaEvolve

| Feature               | AlphaEvolve                                      | DeepEvolve                                                                 |
|-----------------------|--------------------------------------------------|-----------------------------------------------------------------------------|
| **Knowledge Base**     | Relies on the LLM's internal knowledge           | **Broader knowledge**: retrieves information from the Internet              |
| **Code Evolution Scope** | Evolves up to hundreds of lines of code        | **Multi-file evolution**: handles entire codebases, not just single files   |
| **Debugging Support**   | No debugging                                     | **Automatic debugging**: executes and fixes code during each iteration      |
| **Domain Application**  | Applied primarily to math                        | **Wider domain support**: applicable to math, chemistry, biology, materials, and more |

DeepEvolve also inherits many strengths from AlphaEvolve, including:

- The use of state-of-the-art LLMs
- Long-horizon evaluation with GPU acceleration
- Rich contextual prompting and feedback
- The ability to optimize multiple metrics simultaneously

Beyond code evolution, **DeepEvolve extends Deep Research through idea evolution driven by evaluation**, where:

- New ideas are generated by *“standing on the shoulders of giants”*, drawing inspiration from previously explored ideas in the database.
- Each research idea has a traceable trajectory that shows how it changed through repeated evaluation and refinement.

---

## Installation

### Create an environment

```bash
conda create --name deepevolve python=3.9.21
conda activate deepevolve
```

### Install dependencies

Choose one of the following options:

**Option 1: Full installation** (recommended for running projects in the `examples` folder)
```bash
pip install -r requirements.txt
```

**Option 2: Minimal installation** (for custom projects and the circle packing examples)
```bash
pip install -r requirements-mini.txt
```

For the minimal installation, you'll need to add any additional packages required by your specific project.

---

## Usage

Run DeepEvolve on the circle-packing example:

```bash
python deepevolve.py \
    query="'You are an expert mathematician. Your task is to improve an algorithm that maximizes the sum of circle radii in the circle-packing problem within a unit square, using between 26 and 32 circles. Do not develop neural-network-based models. The algorithm must produce exact, valid packings that satisfy these constraints: circles must not overlap and must remain entirely within the square.'" \
    problem="circle_packing"
```

* `query`: the user's instructions to the agent.
* `problem`: the name of the problem folder inside the workspace (`examples` by default).
* More parameters can be found in `configs/config`. Common settings include `workspace` (defaults to `"examples"`) and `checkpoint` (defaults to `"ckpt"`).

DeepEvolve is built on the OpenAI Agents SDK, so set `OPENAI_API_KEY` in your environment before running it.
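
A short pre-flight check can confirm that the variable is set before a run is launched. The sketch below is only an illustration: the helper name `has_openai_key` is hypothetical and not part of DeepEvolve.

```python
import os


def has_openai_key(env=None):
    """Return True if OPENAI_API_KEY is set to a non-empty value.

    `env` defaults to the process environment; passing a plain dict
    makes the check easy to test. Hypothetical helper, not a DeepEvolve API.
    """
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())
```

Calling `has_openai_key()` with no arguments inspects the current process environment, which is what a launcher script would normally want.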

The best run is written to `{workspace}/{problem}/{checkpoint}/best`. Periodic checkpoints are written alongside it to `{workspace}/{problem}/{checkpoint}/checkpoint_{i}`, at a frequency set by `checkpoint_interval`.
Example outputs are included under `examples/circle_packing/ckpt`.
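
The output layout above can be resolved programmatically, for example to pick up the latest checkpoint after a run. This is a minimal sketch, assuming the `{i}` suffix in `checkpoint_{i}` is an integer; the function name `result_dirs` is hypothetical and not part of DeepEvolve.

```python
from pathlib import Path


def result_dirs(workspace="examples", problem="circle_packing", checkpoint="ckpt"):
    """Return (best_dir, checkpoint_dirs) for the layout described above.

    checkpoint_dirs is sorted by the integer suffix of `checkpoint_{i}`
    (assumed to be an integer); other names are ignored.
    """
    root = Path(workspace) / problem / checkpoint
    best_dir = root / "best"

    def index(path):
        return int(path.name.split("_")[-1])

    checkpoints = sorted(
        (
            p
            for p in root.glob("checkpoint_*")
            if p.is_dir() and p.name.split("_")[-1].isdigit()
        ),
        key=index,
    )
    return best_dir, checkpoints
```

With the default arguments, `result_dirs()` points at the bundled example outputs under `examples/circle_packing/ckpt`.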

---

## Adding a New Problem

1. Inside the workspace (default: `examples`), create a folder named after the problem.

2. Place your starter code in an `initial_code` subfolder.

3. Add an `info.json` file:

   ```json
   {
     "problem": {
       "name": "problem name",
       "description": "description of the problem",
       "metric": "description of the metric",
       "interface": "deepevolve_interface.py"
     },
     "initial_idea": {
       "title": "initial idea title",
       "content": "description or link to the idea",
       "supplement": "description or link to extra material"
     }
   }
   ```

4. In `initial_code`, write `deepevolve_interface.py` that defines:

   ```python
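   from typing import Union

   # Union keeps the annotation valid on Python 3.9; `dict | str` needs 3.10+.
   def deepevolve_interface() -> tuple[bool, Union[dict, str]]: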
       """
       Returns:
           success (bool): True if run finished without error.
           result: metric dict (must include "combined_score") or error text.
       """
   ```

   **The metric dictionary guides optimization; a higher `combined_score` is better.**
   You can include other metrics (floats or strings), which will also be used to instruct the LLMs.
   A simple example for `deepevolve_interface.py` is:

   ```python
   import traceback
   from time import time
   import warnings

   # Import the entry point of your initial code. `main_file`, `main_func`,
   # and `args` below are placeholders; replace them with your own names.
   # from main_file import main_func

   def deepevolve_interface():
       try:
           with warnings.catch_warnings(record=True) as caught:
               warnings.simplefilter("always")
               start_time = time()
               eval_score = main_func(args)
               runtime = time() - start_time

           warning_messages = [str(w.message) for w in caught]

           runtime = round(runtime / 60, 2)
           metrics = {
               "combined_score": eval_score,
               "runtime_minutes": runtime,
           }

           if warning_messages:
               # Deduplicate while preserving first-seen order; keep at most 10.
               metrics["program_warnings"] = list(dict.fromkeys(warning_messages))[:10]
           return True, metrics
           
       except Exception as e:
           # Capture full traceback information
           error_traceback = traceback.format_exc()
           error_info = f"""
           Error type: {type(e).__name__}
           Error message: {str(e)}
           Traceback: {error_traceback}
           """
           return False, error_info
   ```
   See the projects in the `examples` folder for other definitions of the interface file and the `deepevolve_interface()` function.
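
Before launching a run on a new problem, it can help to check the two pieces described in steps 3 and 4: the `info.json` contents and the value returned by `deepevolve_interface()`. The sketch below is a hedged illustration, not DeepEvolve's own validation. The helper names are hypothetical, and it assumes that every key in the `info.json` template is required and that `combined_score` must be a finite number.

```python
import math

# Assumption: every key shown in the info.json template is required.
REQUIRED_INFO_KEYS = {
    "problem": ("name", "description", "metric", "interface"),
    "initial_idea": ("title", "content", "supplement"),
}


def check_info(info):
    """Return a list of problems found in a parsed info.json (empty if none)."""
    problems = []
    for section, keys in REQUIRED_INFO_KEYS.items():
        block = info.get(section)
        if not isinstance(block, dict):
            problems.append(f"missing section: {section}")
            continue
        for key in keys:
            if key not in block:
                problems.append(f"missing key: {section}.{key}")
    return problems


def check_result(success, result):
    """Return a list of problems with a deepevolve_interface() return value."""
    if not isinstance(success, bool):
        return ["first element must be a bool"]
    if not success:
        if isinstance(result, str):
            return []
        return ["on failure, result should be error text"]
    if not isinstance(result, dict):
        return ["on success, result must be a metric dict"]
    score = result.get("combined_score")
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return ['metric dict must include a numeric "combined_score"']
    if not math.isfinite(score):
        return ['"combined_score" must be finite']
    return []
```

A typical use is to load `info.json` with `json.load`, pass the result to `check_info`, then call `check_result(*deepevolve_interface())` and inspect both lists before starting a long run.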

**(Optional) Dataset Preparation:** Many scientific projects require training and evaluating deep learning models on different datasets. We save these datasets in the `data_cache` folder. If you are running one of the provided example projects, you can prepare the dataset by running the corresponding Python script in the `data_cache` folder. For example, you can `cd data_cache` and then run `python {problem_name}.py`.
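
The same dataset step can be driven from Python, which is convenient when preparing several problems in a row. This is a minimal sketch that mirrors `cd data_cache` followed by `python {problem_name}.py`; the function names are hypothetical, and it assumes a script named after the problem exists in `data_cache`.

```python
import subprocess
import sys
from pathlib import Path


def dataset_command(problem_name, data_cache="data_cache"):
    """Build the command and working directory for a dataset script.

    Mirrors `cd data_cache && python {problem_name}.py`.
    """
    cache_dir = Path(data_cache)
    script = cache_dir / f"{problem_name}.py"
    if not script.is_file():
        raise FileNotFoundError(f"no dataset script at {script}")
    # Run with the current interpreter so the active environment is used.
    return [sys.executable, script.name], cache_dir


def prepare_dataset(problem_name, data_cache="data_cache"):
    """Run the dataset script; raises CalledProcessError on a non-zero exit."""
    command, cwd = dataset_command(problem_name, data_cache)
    subprocess.run(command, cwd=cwd, check=True)
```

Using `sys.executable` rather than a bare `python` keeps the script inside the `deepevolve` conda environment when the caller is already running there.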