Abstract: Autonomous driving requires assessing and managing risk in complex environments. One way to address this is to generate human-readable explanations of risk-related tasks that help justify driving behavior. In this paper, we introduce VISION-RISK, a vision-language model (VLM) for risk assessment in autonomous driving, built on a lightweight architecture optimized for deployment on edge devices. To train the model, we developed a custom dataset that combines real-world driving scenarios from the Honda Driving Dataset with extreme high-risk cases from the Car Crash Dataset, augmented with synthetic annotations generated using Dolphins and refined via DeepSeek-V3. VISION-RISK stands out through three key characteristics: the integration of danger-level classification with natural language explanation generation, a lightweight architecture optimized for deployment on resource-constrained devices, and a focus on safety through risk assessment to support trust in autonomous driving.
External IDs: dblp:conf/iccp2/ConstantinNLP25