LLM4ST-Traffic: Leveraging Large Language Models for Cross-modal Knowledge Transfer to Overcome Data Sparsity in Traffic Prediction
Abstract: Traffic prediction is a core challenge in Intelligent Transportation Systems (ITS). Advances in deep learning have driven significant progress in traffic prediction models, but the growing complexity of these models has raised their demands on data scale, and existing models hit performance bottlenecks when model complexity outpaces the available data. This paper proposes LLM4ST-Traffic, a traffic prediction framework based on Large Language Models (LLMs) that addresses data sparsity through cross-modal semantic alignment and lightweight fine-tuning. Its core innovations are (i) a Cross-Modal Alignment (CMA) module, which uses cross-attention to establish deep connections between traffic features (such as flow trends and periodicity) and textual concepts, overcoming the semantic disconnect caused by conventional linear projections, and (ii) Prefix Adapter Fine-Tuning (PAFT), which trains only learnable prefix prompts, optimizing predictive performance while preserving pre-trained knowledge. Experiments show that LLM4ST-Traffic achieves strong prediction accuracy and robustness, and performs particularly well in low-sample scenarios. Interpretability analysis validates the effectiveness of the semantic alignment.
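To make the two abstract-level ideas concrete, below is a minimal PyTorch sketch (not the authors' code; all module and parameter names are illustrative assumptions) of cross-attention alignment between traffic-series embeddings and text-token embeddings (the CMA idea) and of learnable prefix prompts prepended to a frozen LLM's input (the PAFT idea).

```python
# Hypothetical sketch of the two modules named in the abstract; the paper's
# actual architecture, dimensions, and names may differ.
import torch
import torch.nn as nn


class CrossModalAlignment(nn.Module):
    """Align traffic patch embeddings with word embeddings via cross-attention."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, traffic_emb: torch.Tensor, text_emb: torch.Tensor) -> torch.Tensor:
        # traffic_emb: (B, N_patches, d_model), queries from the traffic series
        # text_emb:    (V, d_model), a "concept vocabulary" of word embeddings
        text = text_emb.unsqueeze(0).expand(traffic_emb.size(0), -1, -1)
        aligned, _ = self.attn(query=traffic_emb, key=text, value=text)
        # Traffic tokens re-expressed in the LLM's semantic space, replacing a
        # plain linear projection between modalities.
        return aligned


class PrefixAdapter(nn.Module):
    """Learnable prefix tokens; only these (plus small heads) would be trained."""

    def __init__(self, n_prefix: int, d_model: int):
        super().__init__()
        self.prefix = nn.Parameter(torch.randn(n_prefix, d_model) * 0.02)

    def forward(self, token_emb: torch.Tensor) -> torch.Tensor:
        # token_emb: (B, L, d_model); prepend the prefix to every sequence
        # before feeding it to the frozen LLM backbone.
        prefix = self.prefix.unsqueeze(0).expand(token_emb.size(0), -1, -1)
        return torch.cat([prefix, token_emb], dim=1)
```

In this reading, the LLM backbone stays frozen while the prefix (and any small input/output heads) carry the task-specific gradient updates, which is what makes the fine-tuning lightweight and data-efficient.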
Paper Type: Long
Research Area: Multimodality and Language Grounding to Vision, Robotics and Beyond
Research Area Keywords: cross-modal application, cross-modal information extraction, data-efficient training
Contribution Types: Model analysis & interpretability
Languages Studied: English
Submission Number: 5269