# A Framework for Cooperative LLM Agents in Traffic Signal Control

<a id="Overview"></a>
## 1. Overview

This repository provides the official implementation for CoLLMLight: Cooperative Large Language Model Agents for Network-Wide Traffic Signal Control

Network-wide optimization in traffic signal control (TSC) requires agents to cooperate across intersections. However, recent Large Language Model (LLM)-based TSC agents are designed as independent agents without inter-intersection cooperation, which limits their effectiveness at the network level. To address this gap, we propose CoLLMLight, a cooperative LLM agent framework for network-wide traffic signal control. CoLLMLight introduces a spatiotemporal-aware cooperative reasoning module to analyze interactions with neighboring agents and produce cooperative suggestions. This reasoning process is implemented within an asynchronous decision architecture to support multi-step reasoning without compromising real-time responsiveness. To further improve both cooperation effectiveness and reasoning efficiency, we propose a cost-aware cooperation optimization strategy. It first applies adaptive reasoning optimization to equip the LLM with the capability to generate concise yet effective cooperative reasoning across varied traffic conditions. Then, it refines the policy using reward signals that encourage both effective decision-making and efficient reasoning. Extensive experiments on four real-world traffic networks demonstrate that CoLLMLight significantly outperforms existing methods by enabling more effective and generalizable cooperation, while ensuring low decision latency and efficient token usage.

![Agent Framework Overview](./media/Overview.png)

<a id="requirements"></a>
## 2. Requirements

- `python>=3.9`
- `cityflow` (Requires a Linux environment; tested on Ubuntu)
- `transformers==4.48.2`
- `vllm`
- `lmdeploy`
- `torch==2.2.2`

You can install the required Python packages using:
```bash
pip install -r requirements.txt
```

<a id="Usage"></a>
## 3. Quick Start

To run inference with a trained model, follow these steps.

**Step 1: Deploy the LLM Inference Server**

Deploy your chosen large language model using a service like `vllm` or `lmdeploy`.

```bash
# Example using lmdeploy
lmdeploy serve api_server /path/to/your/llm --tp <num_gpus>
```

**Step 2: Run the Simulation**

Execute the main script to run the agent in the CityFlow simulation environment.

```bash
python run_CoLLMlight.py \
    --model_path /path/to/your/llm \
    --dataset 'newyork_28x7' \
    --traffic_file 'anon_28_7_newyork_real_double.json'
```

<a id="Training"></a>
## 4. Training Workflow

The training process consists of three main stages.

### Stage 1: Simulation Data Sampling

First, sample simulation data from the CityFlow environment. This data will serve as the basis for the subsequent training steps.

```bash
python run_fts.py
```

The sampled data will be saved to `./data/FinetuneData/SynTrain_sample.json`.

### Stage 2: Adaptive Reasoning Chain Generation

Next, generate synthetic reasoning chains using a powerful teacher model (e.g., GPT-4o) and the data sampled in the previous step.

**1. Configure API Key:**
Set your OpenAI API key in the `./utils/LLMs.py` file.

**2. Generate Data:**
Run the generation script.

```bash
python reasoning_tuning_data_synth.py
```

The output, saved in `./data/FinetuneData/syn_rt_data.json`, contains the reasoning data for fine-tuning a base LLM. You can use standard fine-tuning libraries like a LLaMA Factory for this purpose.

### Stage 3: Policy Refinement

Finally, use the fine-tuned LLM from Stage 2 to perform policy refinement via Proximal Policy Optimization (PPO).

**1. Configure Model Path:**
In `config/ppo_config.yaml`, set the `model_name` parameter to the path of your fine-tuned LLM from Stage 2.

**2. Run PPO Training:**
Execute the PPO training script.

```bash
python ppo_ft.py
```
