# Agent Distillation

`agent-distillation` is a library that supports **distillation** of large language agents into small language models, with just a few scripts!

Built on top of [`smolagents` v1.13.0.dev0](https://github.com/huggingface/smolagents), this library extends the agent training pipeline with utilities for logging, training, and benchmarking, all designed for simplicity and reproducibility.

## 🔧 What This Library Offers

On top of all the awesome tools provided by `smolagents`, `agent-distillation` adds:

1. 📜 **Logging**: Seamlessly save agent run logs to create training-ready trajectories.
2. 🎓 **Training**: Use [TRL](https://github.com/huggingface/trl)'s SFT trainer to train small agents that remain compatible with `smolagents`.
3. 📊 **Benchmarking**: Evaluate your distilled agents on factual and mathematical reasoning benchmarks using a single script.


## 📦 Contents

1. [Installation](#installation)
2. [Quickstart: How to Distill Agents](#quickstart-how-to-distill-agents)
3. [Acknowledgements](#acknowledgements)


## 🛠 Installation

To install with the required libraries:

```bash
conda create -n agents python=3.12
conda activate agents
# Quote the extras so shells such as zsh do not expand the brackets
pip install -e ".[distill]"
```
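
After installing, a quick import check can catch a broken environment early. The snippet below is a minimal sketch and is not part of the repository: the helper name `check_modules` and the `PYTHON` override are hypothetical, and the module names `smolagents` and `trl` are assumed from the libraries mentioned in this README.

```bash
# Hypothetical post-install check (not shipped with the repository).
# Prints one status line per module and returns non-zero if any import fails.
# Set PYTHON to choose the interpreter (defaults to `python`).
check_modules() {
    local status=0
    local module
    for module in "$@"; do
        if "${PYTHON:-python}" -c "import ${module}" >/dev/null 2>&1; then
            echo "ok: ${module}"
        else
            echo "missing: ${module}"
            status=1
        fi
    done
    return "${status}"
}

# Module names are assumptions based on the libraries named in this README.
check_modules smolagents trl || echo "Some modules failed to import; re-run the install step."
```

Run it inside the activated `agents` environment so that `python` resolves to the interpreter you installed into.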

### ➕ Optional: Retriever Environment (used in our paper)

Want to reproduce or extend our retriever-enhanced setup? We follow the [Search-R1](https://github.com/PeterGriffinJin/Search-R1) environment.

Expand the section below for setup instructions.
<details>
<summary>Detailed setup instructions</summary>

1. Make a conda environment for the retriever.

```bash
conda create -n retriever python=3.10
conda activate retriever
```

2. Install the required libraries.

```bash
conda install pytorch==2.4.0 torchvision==0.19.0 torchaudio==2.4.0 pytorch-cuda=12.1 -c pytorch -c nvidia
pip install transformers datasets pyserini
conda install -c pytorch -c nvidia faiss-gpu=1.8.0
pip install uvicorn fastapi
```

3. Download the index and corpus, then merge the index parts and decompress the corpus.

```bash
save_path=./search/database/wikipedia
mkdir -p "$save_path"
python scripts/download.py --save_path "$save_path"
# Merge the downloaded index parts into a single index file
cat "$save_path"/part_* > "$save_path/e5_Flat.index"
# Decompress the corpus
gzip -d "$save_path/wiki-18.jsonl.gz"
```
```
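
Because the download is large, it may be worth confirming that the merged index and the decompressed corpus are actually in place before starting a retriever server. The following is a minimal sketch, not part of the repository; the helper name `check_retriever_files` is hypothetical, while the two file names come from the commands above.

```bash
# Hypothetical helper (not part of the repository): confirm that the merged
# index and the decompressed corpus exist and are non-empty.
check_retriever_files() {
    local dir="$1"
    local name
    for name in e5_Flat.index wiki-18.jsonl; do
        if [ ! -s "${dir}/${name}" ]; then
            echo "missing or empty: ${dir}/${name}" >&2
            return 1
        fi
    done
    echo "retriever files look complete in ${dir}"
}

save_path=./search/database/wikipedia
check_retriever_files "$save_path" || echo "Re-run the download step above."
```

The check only tests for presence and non-zero size; it does not validate the contents of either file.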

</details>

## ⚗️ Quickstart: How to Distill Agents

All scripts assume access to 4 GPUs.
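
Since the scripts assume 4 GPUs, a pre-flight check can save a failed launch. This is a minimal sketch, not part of the repository: the helper name `count_gpus` is hypothetical, and it assumes NVIDIA GPUs with `nvidia-smi` on the `PATH` (its `-L` option prints one line per GPU).

```bash
# Hypothetical pre-flight check (not part of the repository).
# Counts GPUs listed by `nvidia-smi -L`, which prints one "GPU n: ..." line
# per device; reports 0 when nvidia-smi is not installed.
count_gpus() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # grep -c exits non-zero on zero matches, so mask its exit status
        nvidia-smi -L 2>/dev/null | grep -c '^GPU ' || true
    else
        echo 0
    fi
}

required_gpus=4
available_gpus="$(count_gpus)"
if [ "$available_gpus" -lt "$required_gpus" ]; then
    echo "Found ${available_gpus} GPU(s); the scripts assume ${required_gpus}." >&2
fi
```

With fewer GPUs, the scripts themselves would likely need to be adjusted; this check only reports the mismatch.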

1. 🧪 Generate Trajectories from Teacher Agent

```bash
bash scripts/inference/run_agent_teacher_train.sh
```

2. 🎓 Train the Student Agent

```bash
bash scripts/training/train_agent.sh Qwen/Qwen2.5-1.5B-Instruct
```

3. ✅ Evaluate the Trained Agent on Benchmarks

The evaluation script runs with self-consistent action generation enabled by default:

```bash
bash scripts/run_agent_student.sh Qwen/Qwen2.5-1.5B-Instruct training_outputs/qwen-1.5B-instruct/agent_baseline_qwen2.5_32B_teacher
```
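
The three steps above can be chained into a single run that stops at the first failure. The wrapper below is a minimal sketch and is not part of the repository: `run_step`, `distill_pipeline`, and the `DRY_RUN` variable are hypothetical names, and it assumes the training step writes its outputs to the directory that is later passed to the evaluation script.

```bash
# Hypothetical wrapper (not part of the repository) that chains the three
# quickstart steps. With DRY_RUN=1 it only prints the commands.
run_step() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != "1" ]; then
        "$@"
    fi
}

distill_pipeline() {
    local student="$1"       # student base model, as in the commands above
    local trained_dir="$2"   # second argument of the evaluation command above
    run_step bash scripts/inference/run_agent_teacher_train.sh || return 1
    run_step bash scripts/training/train_agent.sh "$student" || return 1
    run_step bash scripts/run_agent_student.sh "$student" "$trained_dir" || return 1
}

# Preview the commands without launching anything.
DRY_RUN=1 distill_pipeline \
    Qwen/Qwen2.5-1.5B-Instruct \
    training_outputs/qwen-1.5B-instruct/agent_baseline_qwen2.5_32B_teacher
```

Drop `DRY_RUN=1` to execute the steps for real; because each step is followed by `|| return 1`, a failed teacher run or training job prevents the later steps from starting.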

## Running the Distilled Agents

Curious about more capabilities? Check out the [original smolagents instructions](./README_smolagents.md) for advanced usage and custom environments.


## 🙏 Acknowledgements

This project is made possible by the foundational work of the following open-source libraries:

- [**smolagents**](https://github.com/huggingface/smolagents): Provides the core framework for building and running lightweight language agents, which we extend for distillation.

- [**Search-R1**](https://github.com/PeterGriffinJin/Search-R1): Supplies a dense retrieval environment used in our retriever-based experiments.

- [**TRL**](https://github.com/huggingface/trl): Offers the supervised fine-tuning framework we use to train distilled agents effectively.

We sincerely thank the developers and maintainers of these projects.
