VISION-RISK: Vision-Language Model for Risk Assessment in Autonomous Driving

Andrei-Bogdan Constantin, Vlad-Andrei Negru, Camelia Lemnaru, Rodica Potolea

Published: 2025, Last Modified: 26 May 2026, Venue: ICCP 2025, License: CC BY-SA 4.0
Abstract: Autonomous driving requires assessing and managing risk in complex environments. One way to address this is to generate human-readable explanations for risk-related tasks, which help justify driving behavior. In this paper, we introduce VISION-RISK, a vision-language model (VLM) for risk assessment in autonomous driving, built on a lightweight architecture optimized for deployment on edge devices. To train the model, we developed a custom dataset that combines real-world driving scenarios from the Honda Driving Dataset with extreme high-risk cases from the Car Crash Dataset, augmented with synthetic annotations generated using Dolphins and refined via DeepSeek-V3. VISION-RISK stands out through three key characteristics: the integration of danger level classification with natural language explanation generation, a lightweight architecture suited to resource-constrained devices, and a focus on safety through risk assessment to support trust in autonomous driving.