For object detection and vehicle tracking from the real-time video feed, and for SUMO installation and configuration,
we used the following two repositories:

https://github.com/reetipd/SumoFiles

https://github.com/shebywiliamsjr/TrafficDT

All credit for real-time data collection goes to: reetipd (Reeti Pradhananga)

We consolidated all the output folders into a single parent directory named "New folder" to organize our custom dataset and facilitate LLM performance evaluation.
While doing so, we made minor modifications to .net.xml, edge.xml, type.xml, and the other network-related XML files, but not to route.rou.xml.
These updated network definitions were stored in the new_road_network folder.

Sudden Burst of Vehicles & Traffic Signal Malfunction

Sudden Burst:

We used 1-minute and 2-minute data intervals.
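
To illustrate what a fixed data interval means here, the sketch below groups vehicle timestamps into 1-minute or 2-minute windows and counts the vehicles in each window. It assumes timestamps are given in seconds from the start of the recording. The function name and input format are hypothetical and are not taken from the notebooks.

```python
from collections import Counter


def count_per_interval(timestamps_s, interval_s=60):
    """Count vehicles per fixed-length window.

    timestamps_s: iterable of vehicle timestamps in seconds (assumed format).
    interval_s:   window length in seconds (60 for 1 minute, 120 for 2 minutes).
    Returns a dict mapping window index -> number of vehicles in that window.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    counts = Counter(int(t // interval_s) for t in timestamps_s)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    sample = [0, 30, 59, 60, 125]  # hypothetical timestamps
    print(count_per_interval(sample, interval_s=60))
    print(count_per_interval(sample, interval_s=120))
```

A window whose count is far above its neighbours is the kind of pattern the sudden-burst scenario is meant to represent.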

After changing the training data (a new day), the route and network XML files were refreshed to maintain a clean environment.
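
One possible way to do this refresh is to copy pristine XML files from a backup folder over the working copies. The sketch below assumes such a backup folder exists. The function name and the folder arguments are hypothetical, and the notebooks may handle this step differently.

```python
import shutil
from pathlib import Path


def refresh_xml_files(clean_dir, work_dir, pattern="*.xml"):
    """Overwrite XML files in work_dir with the pristine copies from clean_dir.

    clean_dir: folder holding unmodified route/network XML files (assumed).
    work_dir:  folder the simulation reads from.
    Returns the list of file names that were copied.
    """
    clean_dir, work_dir = Path(clean_dir), Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(clean_dir.glob(pattern)):
        # copy2 also preserves timestamps, which helps when comparing runs
        shutil.copy2(src, work_dir / src.name)
        copied.append(src.name)
    return copied
```

Calling this before each new day of data keeps leftover edits from one run out of the next.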

For dataset creation, refer to: for_sudden_burst_training_data.ipynb

We trained on 12 routes.

For evaluation, refer to: for_sudden_burst_evaluation.ipynb

The model was trained on data from 6 different days and evaluated on 2 separate days.

Traffic Signal Malfunction:

We utilized 2-minute to 6-minute data intervals.

Two faulty traffic signal logics were designed:

One to starve the North–South approach

Another to starve the East–West approach

The traffic light logic was modified in the .net.xml file.
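
As a rough illustration, a faulty signal program could be produced by shortening the green phases of one approach. The sketch below assumes the standard SUMO layout of tlLogic and phase elements in the .net.xml file. The junction ID "J1", the link indices, and the 3-second green are hypothetical; which link indices belong to the North-South or East-West approach depends on the actual network. This is an illustration, not necessarily the faulty logic used in the notebooks.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for a .net.xml file; real networks contain many more elements.
SAMPLE_NET = """<net>
    <tlLogic id="J1" type="static" programID="0" offset="0">
        <phase duration="30" state="GGrr"/>
        <phase duration="4" state="yyrr"/>
        <phase duration="30" state="rrGG"/>
        <phase duration="4" state="rryy"/>
    </tlLogic>
</net>"""


def starve_links(net_xml, tl_id, starved_links, min_green=3):
    """Shorten every phase in which any of the given links has a green signal.

    net_xml:       network XML as a string.
    tl_id:         id of the tlLogic element to modify (hypothetical here).
    starved_links: indices into the phase state string for the starved approach.
    min_green:     new duration in seconds for the affected phases (assumed value).
    Returns the modified XML as a string.
    """
    root = ET.fromstring(net_xml)
    for tl in root.iter("tlLogic"):
        if tl.get("id") != tl_id:
            continue
        for phase in tl.findall("phase"):
            state = phase.get("state")
            # 'G' and 'g' are the green signal states in a SUMO state string
            if any(state[i] in "Gg" for i in starved_links):
                phase.set("duration", str(min_green))
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(starve_links(SAMPLE_NET, "J1", starved_links=[0, 1]))
```

Passing the link indices of the other approach gives the second faulty program.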

All other supporting XML files (except route.rou.xml) were saved in the new_road_network folder.

For dataset creation, refer to: traffic_malfunction_training_data.ipynb
(Note: After changing training data, route and network XMLs were refreshed each time.)

For evaluation, refer to: traffic_malfunction_evaluation.ipynb
As with the sudden burst scenario, the model was trained on data from 6 different days and evaluated on 2 separate days.

Training Days:

NE12_50

Bellevue_150th_Newport__2017-09-11_17-08-32

Bellevue_150th_Newport__2017-09-11_08-08-31

Bellevue_150th_Newport__2017-09-10_18-08-24

Bellevue_116th_NE12th_2017-09-10_19-08-25

150_NE

Location: "C:\Users\tasfi\Downloads\New folder\Finish_by_today\Finish_by_today\Finish_by_today"

Testing Days:

Bellevue_116th_NE12th__2017-09-11_09-08-31

Bellevue_116th_NE12th__2017-09-11_14-08-35

Location: "C:\Users\tasfi\Downloads\New folder\Finish_by_today\Finish_by_today"

Stalled Vehicle Scenario

We used 2-minute to 6-minute data intervals.

For evaluation, refer to: broken_car_situation_for_evaluation.ipynb

Performance was evaluated across all 8 available days of data.






Unified Testing for All Scenarios

To test all three scenarios in a streamlined manner:

Each scenario (stalled vehicle, signal malfunction, sudden burst) was added separately.

After each run, route and network XMLs were refreshed to maintain a clean environment.

A detection script automatically identifies the scenario and prints a descriptive message.

The optimized logic is generated by running test.py from the "C:\Users\tasfi\Downloads\New folder\messed_up" folder, using the printed message as the prompt.

For this, refer to: testing_all_3_scenarios.ipynb




LLM Model Directory Structure
The messed_up folder serves as the main working directory for fine-tuning and testing the LLM. It contains:

The training dataset (merged_file3_updated.jsonl)

The output directory for saving model weights and checkpoints (llama3_8b_run1)

Evaluation scripts and some test samples
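
To inspect the training dataset before fine-tuning, a JSONL file can be read one record per line. The sketch below is a generic reader; it makes no assumption about the field names inside merged_file3_updated.jsonl, and the helper name is hypothetical.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read a JSON Lines file and return its records as a list.

    Blank lines are skipped. A malformed line raises json.JSONDecodeError.
    """
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            records.append(json.loads(line))
    return records


if __name__ == "__main__":
    path = Path("merged_file3_updated.jsonl")
    if path.exists():
        data = load_jsonl(path)
        print(f"{len(data)} records")
        if data and isinstance(data[0], dict):
            print("keys of first record:", sorted(data[0].keys()))
```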



All code and model weights are available at: https://anonymous.4open.science/r/Generative-Traffic-Simulations-15BB










