# Agent-Driver

This is a repo of Agent-Driver

Note: Running Agent-Driver requires an [OpenAI API account](https://platform.openai.com/)

## Introduction

Human-level driving is an ultimate goal of autonomous driving. Conventional approaches formulate autonomous driving as a perception-prediction-planning framework, yet their systems do not capitalize on the inherent reasoning ability and experiential knowledge of humans. In this paper, we propose a fundamental paradigm shift from current pipelines, exploiting Large Language Models (LLMs) as a cognitive agent to integrate human-like intelligence into autonomous driving systems. Our approach, termed Agent-Driver, transforms the traditional autonomous driving pipeline by introducing a versatile tool library accessible via function calls, a cognitive memory of common sense and experiential knowledge for decision-making, and a reasoning engine capable of chain-of-thought reasoning, task planning, motion planning, and self-reflection. Powered by LLMs, our Agent-Driver is endowed with intuitive common sense and robust reasoning capabilities, thus enabling a more nuanced, human-like approach to autonomous driving. We evaluate our approach on the large-scale nuScenes benchmark, and extensive experiments substantiate that our Agent-Driver significantly outperforms the state-of-the-art driving methods by a large margin. Our approach also demonstrates superior interpretability and few-shot learning ability to these methods.

![Alt text](assets/method.png)

## Installation
a. Clone this repository.

b. Install the dependent libraries as follows:

```
pip install -r requirements.txt 
```

## Data Preparation

a. We used pre-cached data from the nuScenes dataset. The data can be downloaded at [Google Drive](https://drive.google.com/drive/folders/1BjCYr0xLTkLDN9DrloGYlerZQC1EiPie?usp=sharing).

b. You can put the downloaded data here:
```
Agent-Driver
├── data
│   ├── finetune
|   |   |── data_samples_train.json
|   |   |── data_samples_val.json
│   ├── memory
|   |   |── database.pkl
│   ├── metrics
|   |   |── gt_traj.pkl
|   |   |── gt_traj_mask.pkl
|   |   |── stp3_gt_seg.pkl
|   |   |── uniad_gt_seg.pkl
│   ├── train
|   |   |── [token].pkl
|   |   |── ...
│   ├── val
|   |   |── [token].pkl
|   |   |── ...
│   ├── split.json
├── agentdriver
├── scripts
```

## Training

a. Before we start, we need to fine-tune a GPT-based motion planner (as in the reasoning engine). To do so, you first need to register an [OpenAI API account](https://platform.openai.com/).

b. After registration, you can generate your API-key and your oganization key in your account settings. Here is an example:

```
openai.api_key = "sk-**"
openai.organization = "org-**"
```

c. You need to specify your own keys in the `agentdriver/llm_core/api_keys.py`, and this will be used in running Agent-Driver.

Please note that this is your own key and will be linked to your bill payment, so keep this confidential and do not distribute it to others!

d. For fine-tuning a motion planner, simply run
```
sh scripts/run_finetune.sh
```
will automatically collect data and send finetuning jobs to OpenAI. More details can be found in `agentdriver/execution/fine_tune.py`.

**Note:** Fine-tuning costs money. Please refer to the [pricing page](https://openai.com/pricing). To save your money, by default we use 10% of the full training data for fine-tuning, from which you are supposed to get decent results with less than 10$ usd. You can get better results by using 100% data, and in this setting you may want to specify `sample_ratio=1.0` in `agentdriver/execution/fine_tune.py`.

d. When your fine-tune job successfully completes, you will receive an email notifying your fine-tuned GPT model id, like this
```
ft:gpt-3.5-turbo-0613:**::**
```
This model id denotes your own GPT-based motion planner. You need to specify this model id in `FINETUNE_PLANNER_NAME` of `agentdriver/llm_core/api_keys.py`.  

## Inference

a. Once all keys in `agentdriver/llm_core/api_keys.py` have been set up correctly, you can inference the whole Agent-Driver pipeline in `agentdriver/unit_test/test_lanuage_agent.ipynb`.

b. You can also find the usage of individual components (tool library, cognitive memory, reasoning engine) in the folder `agentdriver/unit_test`.

## Evaluation

a. If you want to evaluate the planning performance on nuScenes validation set, you can first collect the motion planning results by running
```
sh scripts/run_inference.sh
```
You will get a `pred_trajs_dict.pkl` in the `experiments` folder. 

b. For evaluation, you can run
```
sh scripts/run_evaluation.sh uniad YOUR_PRED_DICT_PATH
```
with your `pred_trajs_dict.pkl` file location.