# Evaluation

> [!IMPORTANT]
> **General requirements**
>
> Before you start, make sure you have cloned the repository and are in the **root directory of the project**. Make sure you have installed the required packages with `pip install -e .`. Different package versions may affect the reproducibility of the results.

## Running EvalPlus with vLLM

We implemented batched inference in [evaluation/text2code_vllm.py] using [vLLM](https://docs.vllm.ai/en/latest/). This speeds up the evaluation significantly: **a greedy decoding run can finish within 20 seconds**. The commands below first generate samples and then score them with EvalPlus:

```bash
MODEL=/path/to/your/model
DATASET=humaneval # or mbpp
SAVE_PATH="evalplus-$(basename "$MODEL")-$DATASET.jsonl"

# Step 1: generate samples with batched vLLM inference
CUDA_VISIBLE_DEVICES=0 python -m evaluation.text2code_vllm \
    --model_key "$MODEL" \
    --dataset "$DATASET" \
    --save_path "$SAVE_PATH"

# Step 2: score the generated samples with EvalPlus
python -m evalplus.evaluate --dataset "$DATASET" --samples "$SAVE_PATH"
```
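
Before running `evalplus.evaluate`, it can help to sanity-check the generated file. The sketch below is not part of this repository: `summarize_samples` is a hypothetical helper, and it assumes each line of `$SAVE_PATH` is a JSON object with a `task_id` field and the generated code under either `solution` or `completion`. The actual output of `evaluation/text2code_vllm.py` may carry additional fields.

```python
import json
from collections import Counter
from pathlib import Path


def summarize_samples(path):
    """Summarize a JSONL samples file (hypothetical helper, not part of the repo).

    Assumes one JSON object per line with a `task_id` field and the
    generated code under `solution` or `completion`.
    Returns (total samples, distinct task ids, samples with no code).
    """
    per_task = Counter()
    missing_code = 0
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)
            per_task[record["task_id"]] += 1
            # Count records whose code field is absent or empty.
            if not (record.get("solution") or record.get("completion")):
                missing_code += 1
    return sum(per_task.values()), len(per_task), missing_code
```

For a greedy decoding run, you would expect one sample per task and no samples with empty code. For example, `total, tasks, empty = summarize_samples("evalplus-model-humaneval.jsonl")`, where the file name is a placeholder for your own `$SAVE_PATH`.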
