Leveraging Textual Description and Structured Data for Estimating Crash Risks of Traffic Violation: A Multimodal Learning Approach
Abstract: This study introduces a novel methodology that integrates both structured data and unstructured violation descriptions, addressing a critical gap in current crash risk estimation techniques. By combining these two data types, our approach captures a more comprehensive picture of violation characteristics, enabling more precise identification of crash-prone violations. We propose an innovative framework that leverages advanced Natural Language Processing (NLP) techniques alongside state-of-the-art Large Language Models (LLMs) to convert raw textual information into actionable features, thereby equipping law enforcement with a powerful tool to systematically assess crash risk and prioritize high-risk violations for targeted interventions. Specifically, we compare four NLP algorithms—Jaccard, TF-IDF, fastText, and BERT—with an LLM framework (GPT-3.5) to process violation descriptions. Following text conversion, we implement eight data-driven classification models, ranging from linear and tree-based to neural network approaches, to predict crash-prone violations. Our experimental results reveal that BERT and GPT-3.5 significantly improve classification accuracy and recall by extracting meaningful insights from text, whereas the other NLP methods may introduce noise. Among the classification models, TabNet outperforms others with an accuracy of 0.9717, recall of 0.8861, and AUC of 0.9658.
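To make the text-conversion step concrete, here is a minimal sketch of the simplest of the four NLP methods named above, Jaccard similarity, used to turn a violation description into a numeric feature that sits alongside structured fields. The abstract does not say how Jaccard is applied, so comparing each description against a reference text is an assumption. The field name `speed_over_limit`, the sample records, and the helper `build_features` are hypothetical and are not taken from the paper.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity |A & B| / |A | B| between the token sets of two texts."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 0.0  # convention chosen here for two empty texts
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Hypothetical records: one structured field plus a free-text description.
violations = [
    {"speed_over_limit": 15, "description": "Driver failed to yield right of way"},
    {"speed_over_limit": 0, "description": "Expired registration plate"},
]

# Hypothetical reference text for a violation type assumed to be crash-prone.
reference = "Failure to yield right of way"


def build_features(record: dict, reference_text: str) -> list[float]:
    """Concatenate the structured value with the text-derived similarity score."""
    return [
        float(record["speed_over_limit"]),
        jaccard(record["description"], reference_text),
    ]


features = [build_features(v, reference) for v in violations]
for row in features:
    print(row)
```

Each resulting row could then be passed to any of the downstream classifiers. TF-IDF, fastText, BERT, or an LLM would replace the single similarity score with a richer text representation.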
External IDs: doi:10.1109/tits.2025.3568287