## 🚀 Quick Start

Get started with the included demo agent, which runs the `simple_sum` task by default.

### 1. Install & Run
**Using `uv` (Recommended)**
```bash
uv sync
uv run python run_exp.py
```

**Using `pip`**
```bash
pip install .
python run_exp.py
```

### 2. Custom Run
Override the defaults from `config.yaml` with CLI flags (prefix the command with `uv run` if you installed with `uv`):
```bash
python run_exp.py --task simple_sum --num-test 5 --model-name gpt-4o-mini
```

---

## 🛠️ Installation

### Prerequisites
- **Python 3.11** or higher

### Option A: Using `uv` (Fastest)
[uv](https://github.com/astral-sh/uv) is a fast Python package installer and resolver.
```bash
uv sync
```

### Option B: Using `pip`
Installs the project and its dependencies with the standard `pip` installer.
```bash
pip install .
```

---

## 📂 Project Structure

```text
.
├── agents/             # 🤖 Agent implementations
│   ├── base/           # Abstract base classes for agents
│   └── demo/           # Example agent implementation
├── exp/                # 🧪 Experiment definitions (Tasks)
│   ├── base/           # Abstract base classes for experiments
│   └── demo/           # Example experiment implementation
├── utils/              # 🛠️ Utilities (Logging, Config, Metrics)
├── config.yaml         # ⚙️ Default configuration
└── run_exp.py          # 🏁 Main entry point
```

---

## ⚙️ Configuration

The `config.yaml` file controls the default experiment settings. You can modify this file directly or override values via the CLI.

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `dataset_name` | `str` | Name of the dataset directory in `exp/` | `demo` |
| `task` | `str` | Registered name of the experiment task | `simple_sum` |
| `num_test` | `int` | Number of test iterations to run | `10` |
| `model_name` | `str` | Model identifier (e.g., `gpt-4o-mini`) | `test_model` |
| `agent.name` | `str` | Registered name of the agent to use | `demo` |
| `agent.params` | `dict` | Optional parameters for the agent | `{}` |

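The override behaviour described above can be sketched as follows. This is a minimal illustration, not the actual logic in `run_exp.py`: the `DEFAULTS` dict simply mirrors the table (the real values live in `config.yaml`), and `resolve_config` is a hypothetical helper. Only the three flags shown in the Quick Start are assumed to exist, and a flag that is not passed leaves the default untouched.

```python
import argparse
import copy

# Assumption: these defaults mirror the table above; the project reads them from config.yaml.
DEFAULTS = {
    "dataset_name": "demo",
    "task": "simple_sum",
    "num_test": 10,
    "model_name": "test_model",
    "agent": {"name": "demo", "params": {}},
}


def resolve_config(argv):
    """Merge CLI overrides into a copy of the defaults (hypothetical helper)."""
    parser = argparse.ArgumentParser()
    # A default of None lets us tell "flag not given" apart from a real value.
    parser.add_argument("--task", type=str, default=None)
    parser.add_argument("--num-test", type=int, default=None)
    parser.add_argument("--model-name", type=str, default=None)
    args = parser.parse_args(argv)

    config = copy.deepcopy(DEFAULTS)
    for key, value in vars(args).items():
        if value is not None:
            config[key] = value  # argparse maps --num-test to "num_test"
    return config


config = resolve_config(["--num-test", "5", "--model-name", "gpt-4o-mini"])
print(config["task"], config["num_test"], config["model_name"])
# prints: simple_sum 5 gpt-4o-mini
```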
---

## 🧩 Extending the Framework

### Adding a New Agent

1.  **Create Directory:** `agents/<your_agent_name>/`
2.  **Create File:** `agents/<your_agent_name>/agent.py`
3.  **Implement:** Inherit from `AgentBase` and use the `@register_agent` decorator.

```python
import json

from agents.base.agent import AgentBase
from exp.utils.registry import register_agent


# The string passed here is the name to use for `agent.name` in config.yaml.
@register_agent("my_custom_agent")
class MyAgent(AgentBase):
    async def query(self, prompt, data, logs_dir, query_id):
        # Implement your agent logic here
        result = "processed_result"
        # Return the result as a JSON string
        return json.dumps({"output": result})
```

### Adding a New Experiment

1.  **Create Directory:** `exp/<dataset_name>/`
2.  **Create File:** `exp/<dataset_name>/<task_name>.py`
3.  **Implement:** Inherit from `ExperimentBase` and use the `@register_experiment` decorator.

```python
from exp.base.base import ExperimentBase
from exp.utils.registry import register_experiment

@register_experiment("my_new_task")
class MyExperiment(ExperimentBase):
    def prepare_data(self):
        # Load or preprocess data
        pass

    def data_iterator(self):
        # Yield data items one by one
        yield {"id": 1, "input": "..."}

    async def run_agent(self, data):
        # Define how the agent interacts with the task
        pass

    def calculate_metrics(self, result_list):
        # Compute and return metrics from the collected results.
        # The value below is a placeholder, not a measured result.
        return {"Accuracy": 0.95}
```
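
To make the four hooks concrete, here is a self-contained sketch of how a runner might drive them. The call order (prepare, iterate, run, score) and the `run` driver are assumptions, not the behaviour of `run_exp.py`; `SumExperiment` is a hypothetical stand-in that does not inherit from `ExperimentBase`, and its two data items are toy inputs for illustration only.

```python
import asyncio


class SumExperiment:
    """Hypothetical stand-in that mimics the four hooks shown above."""

    def prepare_data(self):
        # Toy data for illustration only.
        self.items = [
            {"id": 1, "input": [1, 2], "expected": 3},
            {"id": 2, "input": [4, 5], "expected": 9},
        ]

    def data_iterator(self):
        yield from self.items

    async def run_agent(self, data):
        # A real task would call the agent here; this stub sums locally.
        answer = sum(data["input"])
        return {"id": data["id"], "correct": answer == data["expected"]}

    def calculate_metrics(self, result_list):
        if not result_list:
            return {"Accuracy": 0.0}
        correct = sum(1 for r in result_list if r["correct"])
        return {"Accuracy": correct / len(result_list)}


async def run(experiment, num_test):
    """Hypothetical driver: prepare, run up to num_test items, then score."""
    experiment.prepare_data()
    results = []
    for index, item in enumerate(experiment.data_iterator()):
        if index >= num_test:
            break
        results.append(await experiment.run_agent(item))
    return experiment.calculate_metrics(results)


metrics = asyncio.run(run(SumExperiment(), num_test=10))
print(metrics)
# prints: {'Accuracy': 1.0}
```

Computing accuracy from `result_list` rather than returning a constant keeps the reported metric tied to what the agent actually produced.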
