# ALGO

The appendix of the paper is `appendix.pdf`.

We put the prompts, problems, programs and test cases generated in the `ALGO/` folder.

## Infrastructure

We built on the following open-source projects to automatically prompt the ChatGPT code interpreter, collect problem statements from LeetCode (LC) and Codeforces (CF), and submit solutions to both platforms.

 - [revChatGPT](https://github.com/acheong08/ChatGPT)
 - [python-leetcode](https://github.com/fspv/python-leetcode)
 - [CodeforcesApiPy](https://github.com/VadVergasov/CodeforcesApiPy)
 - [codeforces-problem-scraper-api](https://github.com/kerolloz/codeforces-problem-scraper-api/tree/master)

## Prompts

The prompts and tags we used to generate the candidate solutions, oracles, data generators, and data verifiers are listed in `ALGO/prompts.py` as format strings.

## Results for CodeContests

The results for CodeContests are in `ALGO/codecontests.zip`.

For each problem there are three files. For instance, the files for problem [1575_A](https://codeforces.com/problemset/problem/1575/A) are `1575_A_bf.py`, `1575_A_test.py`, and `1575_A_data.json`.

 - `1575_A_bf.py` contains the reference oracle for the problem, where the entry point is the function `solution`.

 - `1575_A_test.py` contains the data generator for the problem, where the entry point is the function `batch_gen_inputs`.

 - `1575_A_data.json` contains the metadata and example cases for the problem. Specifically, the problem definition is in the `prompt` field, and the test cases generated by ALGO are listed in the `examples_separate` field.
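
The sketch below shows one way these three files might be loaded. It assumes `codecontests.zip` has been extracted, that the `.py` files are importable Python modules, and that the JSON file is a single object with the `prompt` and `examples_separate` fields. The helper names `load_entry_point` and `load_problem` are hypothetical and not part of this repo. The argument conventions of `solution` and `batch_gen_inputs` are not documented here, so the sketch only retrieves the functions without calling them.

```python
import importlib.util
import json
from pathlib import Path


def load_entry_point(path, name):
    """Import the Python file at `path` and return its attribute `name`.

    Hypothetical helper: e.g. name="solution" for `*_bf.py`,
    or name="batch_gen_inputs" for `*_test.py`.
    """
    path = Path(path)
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return getattr(module, name)


def load_problem(json_path):
    """Return (problem statement, ALGO-generated test cases) from `*_data.json`."""
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)
    return data["prompt"], data["examples_separate"]
```

For example, `load_entry_point("1575_A_bf.py", "solution")` would return the oracle function, provided the file sits in the current directory. Note that importing a file executes its top-level code, so it is worth running untrusted generated programs in a sandbox.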

We use the ALGO-generated test cases to sort the solutions generated by Codex (taken directly from the [CodeT](https://github.com/microsoft/CodeT) repo) by the number of tests they pass. Each problem has roughly 1000 candidates. They are listed along with the number of test cases in `sorted_codecontests.json`. Note that at most 20 test cases are used for sorting.
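
The ranking step can be illustrated with a minimal sketch. This is not the script used to produce `sorted_codecontests.json`: the function `rank_candidates`, the `run` callback, and the `(input, expected)` test-case shape are all assumptions made for illustration, with expected outputs presumed to come from the reference oracle.

```python
def rank_candidates(candidates, tests, run, max_tests=20):
    """Sort candidates by how many of the first `max_tests` tests they pass.

    candidates: list of candidate programs (any representation).
    tests:      list of (test_input, expected_output) pairs (assumed shape).
    run:        hypothetical callback run(candidate, test_input) -> output.
    Returns a list of (candidate, passed_count), best first.
    """
    tests = tests[:max_tests]
    scored = []
    for cand in candidates:
        passed = 0
        for test_input, expected in tests:
            try:
                if run(cand, test_input) == expected:
                    passed += 1
            except Exception:
                # A crashing candidate simply fails this test.
                pass
        scored.append((cand, passed))
    # list.sort is stable, so ties keep their original relative order.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored
```

In practice `run` would execute each candidate in an isolated process with a timeout. The sketch leaves that to the caller.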

## The LeetCode Benchmark

The 35 LeetCode problems we collected for benchmarking ALGO are listed in `ALGO/leetcode_data/`, where `{problem_id}.md` contains the problem statement and `metadata.json` contains the tags, public example cases, and hints (which we do not use).

## Results for LeetCode Problems

The verifiers for LeetCode problems are in `ALGO/leetcode_results/`. For each problem there is a `{problem_id}_response.json`, in which the reference oracle is in the `code_bruteforce` field, and the data validator and data generators are in the `data_validator`, `data_generator`, `data_random_generator`, and `data_tricky_generator` fields.
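
As a rough sketch, the fields named above could be pulled out of a response file as follows. It assumes the file is a single JSON object keyed by those field names. The helper `load_response` is hypothetical, and the sketch makes no assumption about what type each field holds.

```python
import json

ORACLE_FIELD = "code_bruteforce"
DATA_FIELDS = (
    "data_validator",
    "data_generator",
    "data_random_generator",
    "data_tricky_generator",
)


def load_response(path):
    """Return (oracle, data_fields) from a `{problem_id}_response.json` file.

    Hypothetical helper: `data_fields` maps each field name in DATA_FIELDS
    that is present in the file to its stored value.
    """
    with open(path, encoding="utf-8") as f:
        response = json.load(f)
    oracle = response[ORACLE_FIELD]
    data_fields = {name: response[name] for name in DATA_FIELDS if name in response}
    return oracle, data_fields
```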