Abstract: Spatiotemporal data collected by sensors in an urban Internet of Things (IoT) system inevitably contain missing values, which significantly affect the accuracy of spatiotemporal forecasting. Existing techniques, including those based on large language models (LLMs), show limited effectiveness in forecasting with missing values, especially for high-dimensional sensor data. In this article, we propose a novel spatiotemporal pretrained LLM, dubbed SPLLM, for forecasting with missing values. The network integrates a specialized spatiotemporal fusion graph convolutional network (GCN) module that extracts intricate spatiotemporal and graph-based information to generate suitable inputs to the SPLLM. Furthermore, we propose a feed-forward network (FFN) fine-tuning strategy within the LLM and a final fusion layer, which together enable the model to leverage the pretrained foundational knowledge of the LLM while adapting to new incomplete data. Experimental results indicate that SPLLM outperforms state-of-the-art models on real-world public datasets. Notably, SPLLM exhibits superior performance on incomplete sensor data across a variety of missing rates. A comprehensive ablation study of the key components is conducted to demonstrate their efficiency.
External IDs: dblp:journals/iotj/FangXPSC25