The following code is adapted from the [Hugging Face Agents](https://github.com/aymeric-roucher/agent_reasoning_benchmark) repository. We use the version that corresponds to the date of the results submission, since it is the one that most closely aligns with the results.

## SETUP

### Installation

Create a virtual environment and install the required packages with pip by running the following commands in your terminal:

```bash
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

### Additional Setup

You will need to adjust the following files to your needs:

- `gaia.py`
  - Line 27: Your Hugging Face API key
  - Line 75: Your OpenAI API key
  - Lines 93 and 116: Choose whether to use local LLMs through Ollama or OpenAI
  - Lines 96-114: Load the benchmark you want to run. The default is GAIA. You can take the SimpleQA benchmark from our code and adapt it to your needs.
  - Line 325: The name of the run
  - Line 348: The output folder

- `scripts\tools\web_surfer.py`
  - Line 27: Your SerpAPI key

- `scripts\tools\visual_qa.py`
  - Line 95: Your OpenAI API key
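
If you prefer not to hardcode the keys listed above, one option is to read them from environment variables. The snippet below is only a minimal sketch of that idea and is not part of this repository: the helper names `require_env` and `load_api_keys` are hypothetical, and the variable names `HF_TOKEN`, `OPENAI_API_KEY`, and `SERPAPI_API_KEY` are assumptions that you can rename freely.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


def load_api_keys(names=("HF_TOKEN", "OPENAI_API_KEY", "SERPAPI_API_KEY")) -> dict:
    """Collect all keys in one place (the variable names are assumptions)."""
    return {name: require_env(name) for name in names}
```

With the variables exported in your shell, the hardcoded value at each of the lines listed above could then be replaced by a call such as `require_env("OPENAI_API_KEY")`.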

## RUN

```bash
python3 gaia.py
```

## RESULTS

The benchmark results are stored in the `output_gaia` folder.
To obtain the results, follow these steps.

Inside `results.py`:
- Line 11: Add your HF token.
- Line 104: Replace version with the actual name you used for the run.

```bash
python3 results.py
```

Feel free to adjust the script to your needs.
