This code repository supplements NeurIPS 2024 Submission ID 20983, [Trace is the new AutoDiff: Unlocking Efficient Optimization of Computational Workflows](https://openreview.net/forum?id=rYs2Dmn9tD). It provides:

1. The `Trace` framework that converts a computational workflow optimization problem into an OPTO (Optimization with Trace Oracle) instance.
2. The `OptoPrime` optimizer that iteratively optimizes OPTO problems (in the code, we refer to it as `FunctionOptimizerV2Memory`).
3. Jupyter notebooks illustrating the functionalities provided by `Trace` and `OptoPrime`.
4. Code to replicate all of the experiments reported in the paper.
5. Experiment logs and results to reproduce the plots in the paper.

The repository is structured as a fork of the [AutoGen](https://github.com/microsoft/autogen) package.

To install this repository, run the following command in a new Python virtual environment:

```bash
# Quoting the extras keeps shells such as zsh from expanding the brackets.
pip install -e ".[trace]"
```

## Documentation
To run any of the Jupyter notebooks or experiment scripts, you will need to create a file called `OAI_CONFIG_LIST`. Please see the sample provided in this repository.
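As a minimal sketch, the file could be created as shown here, assuming this repository reads the same JSON format as upstream AutoGen (a list of entries with `model` and `api_key` fields). The model name and key are placeholders, and the working directory as the file's location is also an assumption. The sample shipped in this repository remains the authoritative reference.

```shell
# Hypothetical example: write a placeholder OAI_CONFIG_LIST in the current
# directory, but only if one does not already exist.
if [ ! -f OAI_CONFIG_LIST ]; then
  cat > OAI_CONFIG_LIST <<'EOF'
[
  {
    "model": "gpt-4",
    "api_key": "<your-api-key>"
  }
]
EOF
fi
```

Replace the placeholder values with your own before running any notebook or script.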

Tutorial notebooks are provided in `autogen\notebook\trace`.
To get started, please work through the `BasicsTutorial` and `OptimizationTutorial`.


## Reproducing Experiments

All experiment scripts are in `autogen\exp`.

### Battleship (Section 1.2)

To replicate the experiments of Section 1.2 (detailed in Appendix B.1), please run:

```bash
python battleship_exp.py --n_eval_episodes 20 --optimizer OPRO --seed 3
python battleship_exp.py --n_eval_episodes 20
```

### Numerical Optimization (Section 5.1)
To replicate the experiments of Section 5.1 (detailed in Appendix B.2), please run:

```bash
python run_number_synthetic.py --n 30 --steps 10
python run_number_synthetic.py --n 30 --setup masked --steps 10
python run_number_synthetic.py --n 30 --setup opro --steps 10
python run_number_synthetic.py --n 30 --setup agent --steps 10
python run_number_synthetic.py --n 30 --setup torch --steps 10
```
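The five commands above differ only in the `--setup` flag, so they can be driven by one loop. The sketch below uses only the flags shown above. The `run` helper, the `DRY_RUN` variable, and the `n_cmds` counter are hypothetical conveniences, not part of the repository. By default the commands are printed rather than executed.

```shell
# Hypothetical wrapper: set DRY_RUN=0 to execute; by default commands are printed.
DRY_RUN="${DRY_RUN:-1}"
n_cmds=0

run() {
  n_cmds=$((n_cmds + 1))
  if [ "$DRY_RUN" = "1" ]; then
    echo "$@"
  else
    "$@"
  fi
}

# Default setup (no --setup flag), matching the first command above.
run python run_number_synthetic.py --n 30 --steps 10

# Remaining setups, in the order listed above.
for setup in masked opro agent torch; do
  run python run_number_synthetic.py --n 30 --setup "$setup" --steps 10
done
```

Running the sketch with `DRY_RUN=0` from the directory containing `run_number_synthetic.py` executes the same five commands in sequence.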

### Traffic Control (Section 5.2)
To replicate the experiments of Section 5.2 (detailed in Appendix B.3), please run:

```bash
python Traffic_Experiments.py
```

The run takes a few hours and costs approximately `$200` to complete `20` replications.

Previous seeded runs have been pickled and included in `Traffic_experiment_results.zip`.

To reproduce the plots in the paper without recomputing LLM calls, simply unzip the pickled results before running the script above.
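A minimal sketch of the unzip step, assuming the archive sits in the current directory and that extracting it in place is where the script looks for the pickled results (the source does not state the expected location). The `ARCHIVE` and `status` variables are hypothetical names used only for illustration.

```shell
# Hypothetical helper: unpack the pickled traffic results if the archive is present.
ARCHIVE="Traffic_experiment_results.zip"

if [ -f "$ARCHIVE" ]; then
  # -n: never overwrite files that already exist on disk.
  unzip -n "$ARCHIVE"
  status="unpacked"
else
  echo "Archive $ARCHIVE not found in $(pwd)" >&2
  status="missing"
fi
```

Once the results are unpacked, run `python Traffic_Experiments.py` as described above.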

### BigBench-Hard (Section 5.3)
To replicate the experiments of Section 5.3 (detailed in Appendix B.4), please run:

```bash
python run_prompt_bigbench_dspy.py --task_start 0 --task_end 27
```

### MetaWorld (Section 5.4)
Note that MetaWorld needs [LLF-Bench](https://github.com/microsoft/LLF-Bench) to be installed. 
Please follow the instructions in that repository to install LLFBench-MetaWorld.

An example run of a single configuration is:
```bash
python metaworld_exp.py --logdir tmp --seed 0 --n_optimization_steps 30 --env_name llf-metaworld-pick-place-v2 --verbose --memory 10
```

To replicate the experiments of Section 5.4 (detailed in Appendix B.5), please run:

```bash
python run_mw_exps.py
```

Previous seeded runs have been pickled and included in `trace_mw_exps.zip`.

To reproduce the plots in the paper without recomputing LLM calls, unzip the pickled results and run:

```bash
python plot_mw_results.py
```
