# GPTSwarm Benchmark


## Setup

**Install GPTSwarm packages**

From the `mutated_gptswarm` folder, create a conda environment and install the dependencies with Poetry:

```bash
conda create -n swarm python=3.10
conda activate swarm
pip install poetry
poetry install
```



**Configure LLM and Tools**

To get started, copy the following template files inside the `kgot` directory of the KGoT project folder:

- `kgot/config_llms.template.json` → `config_llms.json`
- `kgot/config_tools.template.json` → `config_tools.json`
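
The copy step can be scripted so that it never overwrites a configuration you have already edited. The following is a minimal sketch; the function name `copy_config_templates` is hypothetical and not part of the KGoT project.

```bash
# Hypothetical helper (not shipped with KGoT): copy each template to its
# target name inside the given directory, keeping any existing target.
copy_config_templates() {
  local dir="$1" name template target
  for name in config_llms config_tools; do
    template="$dir/$name.template.json"
    target="$dir/$name.json"
    if [ ! -f "$template" ]; then
      echo "template not found: $template" >&2
      return 1
    fi
    if [ -e "$target" ]; then
      echo "keeping existing $target"
    else
      cp "$template" "$target"
      echo "created $target"
    fi
  done
}
```

From the project root, this would be called as `copy_config_templates kgot`.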

Set the API keys in `kgot/config_llms.json` for the language models you intend to use. To use a model that is not listed, add an entry for it to the same JSON file.

Configure the search engine in `kgot/config_tools.json`. The system selects the search engine automatically, using the first of the following keys that is provided:

1. Bing API: if `BING_API_KEY` is provided
2. Search API: if `SEARCHAPI_API_KEY` is provided
3. Google API: if `GOOGLE_API_KEY` is provided
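
The priority order above amounts to a first-match check. The sketch below only illustrates that order. It is not the project's implementation, which reads the keys from `config_tools.json`. The function name `select_search_engine` and the labels it prints are assumptions made for this example.

```bash
# Illustrative only: takes the three key values as arguments and prints
# which engine the documented priority order would pick.
select_search_engine() {
  local bing_key="$1" searchapi_key="$2" google_key="$3"
  if [ -n "$bing_key" ]; then
    echo "bing"
  elif [ -n "$searchapi_key" ]; then
    echo "searchapi"
  elif [ -n "$google_key" ]; then
    echo "google"
  else
    echo "none"
  fi
}
```

For instance, with an empty Bing key and both other keys set, the sketch picks the Search API, since it comes before the Google API in the list.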

## Quick Start

> [!WARNING]
> Run all of these commands from the root folder of the KGoT project.

### Run Single Dataset

#### GAIA Dataset

Run the GAIA benchmark with explicit parameters:

```bash
python benchmarks/baselines/gptswarm/gptswarm_gaia.py \
  --gaia_file <path_to_gaia_json> \
  --attachment_folder benchmarks/datasets/GAIA/attachments/validation \
  --log_folder_base <log_directory> \
  --llm_model gpt-4o-mini \
  --max_iterations 7
```

Parameters:
- `--gaia_file`: Path to GAIA JSON file (required)
- `--attachment_folder`: Path to the folder containing GAIA problem attachments
- `--log_folder_base`: Directory for storing logs (required)
- `--llm_model`: LLM model to use (default: gpt-4o-mini)
- `--max_iterations`: Maximum iterations for GPTSwarm (default: 3)
- `--config_llms_path`: Path to LLM configuration file
- `--config_tools_path`: Path to tools configuration file
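
Because several of these parameters are file or folder paths, it can help to check that they exist before starting a long run. The following is a small sketch of such a pre-flight check; `require_paths` is a hypothetical helper and not part of the benchmark scripts.

```bash
# Hypothetical helper: report every path that does not exist and
# return non-zero if at least one is missing.
require_paths() {
  local path missing=0
  for path in "$@"; do
    if [ ! -e "$path" ]; then
      echo "missing: $path" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

A typical use would be to call `require_paths` on the GAIA JSON file, the attachment folder, and the two configuration files, and launch the benchmark only if it succeeds.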

#### SimpleQA Dataset

Run the SimpleQA benchmark:

```bash
python benchmarks/baselines/gptswarm/gptswarm_simpleqa.py \
  --simpleqa_file <path_to_simpleqa_json> \
  --log_folder_base <log_directory> \
  --llm_model gpt-4o-mini \
  --max_iterations 7
```

Parameters:
- `--simpleqa_file`: Path to SimpleQA JSON file (required)
- `--log_folder_base`: Directory for storing logs (required)
- `--llm_model`: LLM model to use (default: gpt-4o-mini)
- `--max_iterations`: Maximum iterations for GPTSwarm (default: 3)
- `--config_llms_path`: Path to LLM configuration file
- `--config_tools_path`: Path to tools configuration file

### Run Multiple Datasets

#### Multiple GAIA Levels

Run benchmarks across multiple GAIA difficulty levels:

```bash
bash benchmarks/baselines/gptswarm/run_multiple_gptswarm_gaia.sh \
  --log_folder_base logs/gptswarm_gaia_test \
  --llm_model gpt-4o-mini \
  --max_iterations 7
```

This script runs the benchmark on the level 1, 2, and 3 GAIA datasets and generates performance plots.

Options:
- `--log_folder_base`: Directory for logs (default: logs/gptswarm_gaia_<model_name>)
- `--attachment_folder`: Path to attachments (default: GAIA/dataset/attachments/validation)
- `--max_iterations`: Maximum iterations (default: 7)
- `--llm_model`: LLM model to use (default: gpt-4o-mini)
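
The default log folder is derived from the model name. A minimal sketch of that naming pattern follows, assuming `<model_name>` is replaced by the value passed to `--llm_model`; the function `default_log_folder` is hypothetical and only illustrates the pattern.

```bash
# Hypothetical helper: build the documented default log folder name
# from a dataset label and a model name.
default_log_folder() {
  local dataset="$1" model="$2"
  echo "logs/gptswarm_${dataset}_${model}"
}
```

Under that assumption, `default_log_folder gaia gpt-4o-mini` prints `logs/gptswarm_gaia_gpt-4o-mini`.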

#### Multiple SimpleQA Tests

Run benchmarks on the SimpleQA dataset:

```bash
bash benchmarks/baselines/gptswarm/run_multiple_gptswarm_simpleqa.sh \
  --log_folder_base logs/gptswarm_simpleqa_test \
  --llm_model gpt-4o-mini \
  --max_iterations 7
```

Options:
- `--log_folder_base`: Directory for logs (default: logs/gptswarm_simpleqa_<model_name>)
- `--max_iterations`: Maximum iterations (default: 7)
- `--llm_model`: LLM model to use (default: gpt-4o-mini)