# RoadSight: Intersection and Roundabout Detection

YOLO-based computer vision system for detecting intersections and roundabouts in aerial imagery.

## Overview

- **Task**: Object detection of road infrastructure in aerial images
- **Classes**: 2 (Intersection, Roundabout)
- **Model**: YOLO11s optimized for road features
- **Performance**: 68.4% mAP@0.5:0.95, 138 FPS on RTX 2080 Ti



## Dataset

🚨 **CRITICAL NOTICE** 🚨
**INCOMPLETE TRAINING DATA**: To meet the 100 MB zip size limit for publication submission, **618 of the 679 training images (91.0%) have been removed** from `data/train/images/`. Only **61 training images remain** in this supplementary material. The validation and test sets are complete.

- **Total Images**: 969 (960×640 pixels)
- **Annotations**: 1,355 instances across 907 images
- **Train/Val/Test**: 679/145/146 images (70%/15%/15%) - **Only 61/679 training images included**
- **Classes**: Intersection (770), Roundabout (585)
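
Because the training split in this archive is incomplete, it can help to check a local copy against the documented split sizes before training. The following is a minimal sketch, not part of the repository: it assumes that `val` and `test` follow the same `<split>/images/` layout as `data/train/images/`, and that images use common extensions. The helper names `count_images` and `missing_images` are hypothetical.

```python
from pathlib import Path

# Expected split sizes as stated in this README.
EXPECTED = {"train": 679, "val": 145, "test": 146}
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image formats


def count_images(data_root):
    """Count image files under <data_root>/<split>/images for each split."""
    counts = {}
    for split in EXPECTED:
        # Only data/train/images/ is confirmed; val/test layout is an assumption.
        image_dir = Path(data_root) / split / "images"
        if image_dir.is_dir():
            counts[split] = sum(
                1 for p in image_dir.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
            )
        else:
            counts[split] = 0
    return counts


def missing_images(counts):
    """Return how many images each split is short of the documented size."""
    return {split: EXPECTED[split] - counts.get(split, 0) for split in EXPECTED}
```

Calling `missing_images(count_images("data"))` on the supplementary archive should report 618 missing training images if the layout assumptions hold.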

## Project Structure

```
roadsight/
├── train.py                     # Training script
├── val.py                       # Validation script
├── bench.py                     # Benchmarking script
├── benchmarks_jetson.log        # Performance results
├── data/                        # Dataset
│   ├── data.yaml                # YOLO configuration
│   ├── dataset_analysis_report.txt
│   └── train/val/test/          # Data splits
└── run/                         # Training outputs
```

## Quick Start

### Installation
```bash
pip install ultralytics wandb torch opencv-python
```

### Training
```bash
python train.py  # Trains YOLO11s with optimized parameters
```

### Validation
```bash
python val.py    # Evaluates model performance
```

### Benchmarking
```bash
python bench.py  # Tests export formats and speed
```

## Training Configuration

Key parameters optimized for road detection:
- **Batch size**: 16
- **Image size**: 640×640
- **Epochs**: 100 (early stopping with a patience of 15 epochs)
- **Class weights**: [1.32, 1.0] (Roundabout, Intersection) to offset class imbalance
- **Augmentation**: Reduced rotation (10°) suitable for road orientation
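
The README does not state how the class weights were derived. One derivation that is consistent with the instance counts in the Dataset section is inverse-frequency weighting relative to the largest class. The sketch below is an assumption, not the author's confirmed method, and `inverse_frequency_weights` is a hypothetical helper.

```python
# Instance counts from the Dataset section.
INSTANCE_COUNTS = {"Intersection": 770, "Roundabout": 585}


def inverse_frequency_weights(counts):
    """Weight each class by (largest class count / its own count), rounded to 2 places."""
    largest = max(counts.values())
    return {name: round(largest / n, 2) for name, n in counts.items()}


weights = inverse_frequency_weights(INSTANCE_COUNTS)
print(weights)  # {'Intersection': 1.0, 'Roundabout': 1.32}
```

Under this assumption the rarer Roundabout class receives weight 770 / 585 ≈ 1.32, which matches the value listed above.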

## Model Details

- **Architecture**: YOLO11s
- **Input**: 640×640 RGB images
- **Output**: Bounding boxes with class predictions
- **Optimization**: Mixed precision, GPU memory management
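
The settings listed under Training Configuration and Model Details map onto Ultralytics training arguments roughly as follows. This is a sketch, not the contents of `train.py`: the `patience` interpretation of the early-stopping value, the `yolo11s.pt` starting weights, and the `project`/`name` values (inferred from the `run/train/yolo11s-16/` output path) are assumptions. The class weights are left out because how `train.py` applies them is not shown here.

```python
# Hyperparameters taken from the lists above; everything else stays at library defaults.
TRAIN_ARGS = {
    "data": "data/data.yaml",  # dataset config from the project tree
    "epochs": 100,
    "patience": 15,            # assumes "early stopping at 15" means a patience of 15 epochs
    "batch": 16,
    "imgsz": 640,
    "degrees": 10.0,           # rotation augmentation range, in degrees
    "amp": True,               # mixed-precision training
    "project": "run/train",    # assumed from the documented output path
    "name": "yolo11s-16",      # assumed from the documented output path
}


def train(weights="yolo11s.pt", **overrides):
    """Train YOLO11s with TRAIN_ARGS; requires the ultralytics package."""
    # Imported lazily so the configuration can be inspected without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**{**TRAIN_ARGS, **overrides})
```

Keyword overrides make quick experiments possible without editing the dictionary, for example `train(batch=8)` on a GPU with less memory.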

## Results

Best model located at: `run/train/yolo11s-16/weights/best.pt`
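
A minimal sketch of how this checkpoint could be evaluated with the Ultralytics API is shown here. It is not the contents of `val.py`. The F1 helper uses the standard harmonic mean of precision and recall; whether the reported F1 values were computed from mean precision and recall or averaged per class is not stated, so treat that as an assumption. `evaluate` and `f1_score` are hypothetical names.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def evaluate(weights="run/train/yolo11s-16/weights/best.pt",
             data="data/data.yaml", split="val"):
    """Validate a checkpoint and return the headline box metrics."""
    # Lazy import: ultralytics is only needed when actually evaluating.
    from ultralytics import YOLO

    metrics = YOLO(weights).val(data=data, split=split)
    precision, recall = metrics.box.mp, metrics.box.mr  # mean precision / recall over classes
    return {
        "mAP50": metrics.box.map50,
        "mAP50-95": metrics.box.map,
        "precision": precision,
        "recall": recall,
        "F1": f1_score(precision, recall),
    }
```

As a sanity check, `f1_score(0.958, 0.937)` evaluates to about 0.947, matching the F1 reported for the selected model.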

**Validation metrics on RTX 2080 Ti:**

| Model | Batch | mAP50 | mAP50-95 | Precision | Recall | F1 | Inference (ms) |
|-------|-------|-------|----------|-----------|--------|----|--------------| 
| YOLOv8n | 16 | 0.969 | 0.635 | 0.978 | 0.956 | 0.967 | 4.653 |
| YOLOv8s | 16 | 0.974 | 0.661 | 0.995 | 0.941 | 0.968 | 4.885 |
| YOLOv8m | 16 | 0.963 | 0.657 | 0.984 | 0.921 | 0.951 | 7.366 |
| YOLOv8l | 16 | 0.965 | 0.629 | 0.978 | 0.939 | 0.958 | 12.707 |
| YOLOv8x | 8 | 0.966 | 0.636 | 0.992 | 0.926 | 0.958 | 18.371 |
| YOLO11n | 16 | 0.975 | 0.637 | 0.979 | 0.940 | 0.959 | 6.870 |
| **YOLO11s** | **16** | **0.966** | **0.676** | **0.958** | **0.937** | **0.947** | **6.668** |
| YOLO11m | 16 | 0.949 | 0.610 | 0.970 | 0.882 | 0.924 | 8.707 |
| YOLO11l | 16 | 0.968 | 0.640 | 0.982 | 0.922 | 0.951 | 12.192 |
| YOLO11x | 8 | 0.957 | 0.608 | 0.993 | 0.901 | 0.945 | 18.507 |
| YOLOv12n | 16 | 0.970 | 0.648 | 0.980 | 0.932 | 0.955 | 10.304 |
| YOLOv12s | 16 | 0.961 | 0.649 | 0.967 | 0.949 | 0.958 | 10.461 |
| YOLOv12m | 8 | 0.968 | 0.614 | 0.953 | 0.947 | 0.950 | 11.169 |
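
The inference column can be converted to throughput with FPS = 1000 / latency in ms. The sketch below covers only the model's inference time per image. It assumes images are processed one after another and ignores pre- and post-processing, so end-to-end throughput would be lower.

```python
def latency_ms_to_fps(latency_ms):
    """Convert per-image latency in milliseconds to images per second."""
    if latency_ms <= 0:
        raise ValueError("latency must be positive")
    return 1000.0 / latency_ms


# Inference latencies (ms) copied from the table above.
latencies = {"YOLOv8n": 4.653, "YOLO11s": 6.668}
for model, ms in latencies.items():
    print(f"{model}: {latency_ms_to_fps(ms):.1f} FPS")
```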

**Benchmark on Jetson Orin Nano:**

| Format | Size (MB) | F1 | mAP50 | mAP50-95 | Inference (ms) |
|--------|-----------|----|---------|---------|--------------| 
| PyTorch (FP32) | 18.3 | 0.963 | 0.971 | 0.684 | 32.6 |
| PyTorch (FP16) | 18.3 | 0.963 | 0.971 | 0.686 | 31.9 |
| PyTorch (INT8) | 18.3 | 0.963 | 0.971 | 0.684 | 33.9 |
| TorchScript (FP32) | 36.4 | 0.957 | 0.973 | 0.665 | 37.8 |
| TorchScript (FP16) | 36.4 | 0.957 | 0.973 | 0.664 | 26.9 |
| ONNX (FP32) | 36.2 | 0.957 | 0.973 | 0.665 | 477.5 |
| ONNX (FP16) | 18.1 | 0.957 | 0.974 | 0.666 | 342.5 |
| TensorRT (FP32) | 38.2 | 0.957 | 0.973 | 0.665 | 24.1 |
| **TensorRT (FP16)** | **21.7** | **0.957** | **0.973** | **0.664** | **14.5** |
| TensorRT (INT8) | 12.2 | 0.956 | 0.975 | 0.653 | 10.9 |
| TensorFlow (FP32) | 36.2 | 0.957 | 0.973 | 0.665 | 368.8 |
| MNN (FP32) | 36.1 | 0.958 | 0.973 | 0.668 | 215.9 |
| MNN (FP16) | 18.1 | 0.958 | 0.973 | 0.667 | 219.4 |
| MNN (INT8) | 9.3 | 0.961 | 0.974 | 0.663 | 212.1 |
| NCNN (FP32) | 36.1 | 0.957 | 0.973 | 0.667 | 250.9 |
| NCNN (FP16) | 18.2 | 0.958 | 0.973 | 0.666 | 326.5 |
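
For reference, a subset of the formats in the table can be produced with the Ultralytics export API. This sketch is not the contents of `bench.py`. The mapping from table labels to export arguments is an assumption, as are the checkpoint path and the use of `data/data.yaml` for INT8 calibration. TensorRT export requires a device with TensorRT installed, such as the Jetson.

```python
# Table label -> Ultralytics export arguments (assumed mapping, subset of formats).
EXPORT_ARGS = {
    "TorchScript (FP32)": {"format": "torchscript"},
    "ONNX (FP32)": {"format": "onnx"},
    "TensorRT (FP32)": {"format": "engine"},
    "TensorRT (FP16)": {"format": "engine", "half": True},
    # INT8 needs calibration images; the dataset config is assumed to supply them.
    "TensorRT (INT8)": {"format": "engine", "int8": True, "data": "data/data.yaml"},
    "NCNN (FP32)": {"format": "ncnn"},
    "MNN (FP32)": {"format": "mnn"},
}


def export(label, weights="run/train/yolo11s-16/weights/best.pt"):
    """Export the checkpoint in the format named by a table label; returns the output path."""
    # Lazy import: ultralytics is only needed when actually exporting.
    from ultralytics import YOLO

    return YOLO(weights).export(imgsz=640, **EXPORT_ARGS[label])
```

For example, `export("TensorRT (FP16)")` would build the engine corresponding to the highlighted row.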



---

*ROADSIGHT: A novel dataset for real-time intersection detection in aerial scenes under seasonal variation*