PLMTrajRec: A Scalable and Generalizable Trajectory Recovery Method with Pre-trained Language Models
Keywords: Spatio-temporal data mining, trajectory data mining, vehicle trajectory recovery
TL;DR: We propose a PLM-based trajectory recovery model that addresses the limited availability of dense trajectories and generalizes across sparse trajectories with varying sampling intervals.
Abstract: Spatiotemporal trajectory data is crucial for various traffic-related applications. However, issues such as device malfunctions and network instability often result in sparse trajectories that lose detailed movement information compared to their dense counterparts. Recovering missing points in sparse trajectories is thus essential. Despite recent progress, three challenges remain. First, the lack of large-scale dense trajectory datasets hinders the training of a trajectory recovery model. Second, the varying spatiotemporal correlations in sparse trajectories make it hard to generalize across different sampling intervals. Third, extracting road conditions for missing points is non-trivial.
To address these challenges, we propose $\textit{PLMTrajRec}$, a novel trajectory recovery model. It leverages the scalability of a pre-trained language model (PLM) and can effectively recover trajectories by fine-tuning with small-scale dense trajectory datasets. To handle different sampling intervals in sparse trajectories, we first convert sampling intervals and movement features into prompts for the PLM to understand. We then introduce a trajectory encoder to unify trajectories of varying intervals into a single interval. To extract road conditions for missing points, we propose an area flow-guided implicit trajectory prompt that represents traffic conditions in each region, and a road condition passing mechanism that infers the road conditions of missing points using the observed ones. Experiments on four public trajectory datasets with three sampling intervals demonstrate the effectiveness, scalability, and generalization ability of PLMTrajRec. Code is available at https://github.com/wtl52656/PLMTrajRec.
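To make the task setup concrete, the following is a minimal sketch and not the PLMTrajRec implementation. It shows how a sparse trajectory arises from a dense one, what a sampling-interval prompt might look like, and a plain linear-interpolation baseline for recovery. The `Point` type, the function names, and the prompt wording are all hypothetical; linear interpolation is a stand-in baseline, not the method proposed in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Point:
    t: float    # timestamp in seconds
    lat: float  # latitude in degrees
    lon: float  # longitude in degrees


def downsample(dense, keep_every):
    """Simulate a sparse trajectory by keeping every `keep_every`-th point."""
    return dense[::keep_every]


def build_prompt(sparse, target_interval):
    """Hypothetical text prompt stating the observed and target intervals.

    Assumes the sparse trajectory is evenly sampled and has >= 2 points.
    """
    interval = sparse[1].t - sparse[0].t
    return (
        f"The trajectory is sampled every {interval:.0f} seconds. "
        f"Recover it at a {target_interval:.0f}-second interval."
    )


def linear_recover(sparse, target_interval):
    """Baseline recovery: linearly interpolate between observed points."""
    recovered = []
    for a, b in zip(sparse, sparse[1:]):
        # Number of target-interval steps between two observed points.
        steps = max(1, round((b.t - a.t) / target_interval))
        for k in range(steps):
            w = k / steps
            recovered.append(
                Point(
                    t=a.t + k * target_interval,
                    lat=a.lat + w * (b.lat - a.lat),
                    lon=a.lon + w * (b.lon - a.lon),
                )
            )
    recovered.append(sparse[-1])  # keep the final observed point
    return recovered
```

A learned model would replace `linear_recover`, since straight-line interpolation ignores the road network and the road conditions that the abstract identifies as hard to extract for missing points.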
Supplementary Material: zip
Primary Area: Machine learning for sciences (e.g. climate, health, life sciences, physics, social sciences)
Submission Number: 3318