
## 💨 Quick Eval

### Environment Setup
To run the baseline code, install the following dependencies:
```bash
conda create -n bird_critic python=3.10 -y
conda activate bird_critic
pip install -r requirements.txt
```

### Generation
Set the model name (e.g., **gpt-4o-2024-08-06**) and the API key in the `config.py` file, then run the following commands to generate the output:
```bash
# Generate the prompt
cd baseline/run
bash generate_prompt.sh

# Run LLM inference (requires the API key to be set in config.py)
bash run_baseline.sh
```
The output will be saved in [`./baseline/outputs/final_output/`](./baseline/outputs/final_output/).
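
Before moving on to evaluation, it can help to confirm that generation actually wrote something. The snippet below is a minimal sketch, not part of the repository: the helper name `list_outputs` is hypothetical, it assumes you run it from the repository root, and it makes no assumption about the output file format beyond "one or more regular files".

```python
from pathlib import Path


def list_outputs(output_dir: str = "./baseline/outputs/final_output") -> list[tuple[str, int]]:
    """Return (file name, size in bytes) for each regular file in output_dir, sorted by name.

    Hypothetical helper; the default path is the output folder named in this README.
    """
    root = Path(output_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Output folder not found: {root}")
    return sorted((p.name, p.stat().st_size) for p in root.iterdir() if p.is_file())


if __name__ == "__main__":
    try:
        for name, size in list_outputs():
            # An empty file usually means the inference step did not finish.
            note = "  <- empty file" if size == 0 else ""
            print(f"{name}: {size} bytes{note}")
    except FileNotFoundError as err:
        print(err)
```

An empty listing or zero-byte files suggest re-running `run_baseline.sh` after checking the API key in `config.py`.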


### Evaluation
We use **Docker** to provide a consistent environment for running the benchmark. To set up the environment, follow these steps:

1. Download the PostgreSQL, MySQL, SQL Server, and Oracle databases.
2. Unzip them and save the folders in [`./evaluation`](./evaluation) as `postgre_table_dumps`, `mssql_table_dumps`, `mysql_table_dumps`, and `oracle_table_dumps`.
3. Build and start the services with Docker Compose:
```bash
cd evaluation
docker compose up --build
```
4. Interact with the databases.
You can use the `perform_query_on_{dialect}_databases()` function in the `evaluation/src/{dialect}_utils.py` file to interact with each database. The function returns the result of the query.
5. Run the evaluation script inside the `so_eval_env` container:
```bash
docker compose exec so_eval_env bash
cd run
bash run_eval.sh
```
Specify the dialect you want to evaluate in the `run_eval.sh` script. The options are:
- `postgresql`
- `mysql`
- `sqlserver`
- `oracle`

The output report file is saved in the same folder as your input file.
To get a log file for each instance, set `--logging` to `true` in the `run_eval.sh` script.
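
A common failure is starting an evaluation for a dialect whose dump folder is missing or misnamed. The following is a minimal pre-flight sketch, not part of the repository: the function `check_dialect_setup` is hypothetical, and the pairing of each dialect option with a dump folder is an assumption inferred from the folder names in step 2.

```python
from pathlib import Path

# Dialect options accepted by run_eval.sh, paired with the dump folder names
# from step 2. The pairing is an assumption based on the names alone.
DUMP_DIRS = {
    "postgresql": "postgre_table_dumps",
    "mysql": "mysql_table_dumps",
    "sqlserver": "mssql_table_dumps",
    "oracle": "oracle_table_dumps",
}


def check_dialect_setup(dialect: str, evaluation_dir: str = "./evaluation") -> Path:
    """Return the dump folder for `dialect`, or raise if the setup looks wrong."""
    if dialect not in DUMP_DIRS:
        raise ValueError(
            f"Unknown dialect {dialect!r}; expected one of {sorted(DUMP_DIRS)}"
        )
    dump_dir = Path(evaluation_dir) / DUMP_DIRS[dialect]
    if not dump_dir.is_dir():
        raise FileNotFoundError(f"Missing dump folder: {dump_dir}")
    return dump_dir


if __name__ == "__main__":
    # Report the status of every dialect without stopping at the first problem.
    for name in DUMP_DIRS:
        try:
            print(f"{name}: found {check_dialect_setup(name)}")
        except FileNotFoundError as err:
            print(f"{name}: {err}")
```

Run it from the repository root before `docker compose up --build`; it only inspects folder names and does not connect to any database.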