Abstract: An important task in transportation studies is to accurately map trajectories of moving individuals onto specific means of transportation, such as trains or buses. In this paper, we consider the following problem: given a train schedule and a trajectory representing the train stations visited by a passenger during a trip, extract the set of trains taken by the passenger during that trip. Specifically, we introduce a novel algorithm based on a Generalized Suffix Tree (GST) to efficiently link passenger trajectories to train schedules, addressing challenges such as large data volumes and noisy input trajectories. Our method constructs a GST from train schedules and allows multiple schedules to be integrated into a single searchable structure, enabling rapid and precise matching of trajectories to train routes. Although we use trains as an example, the approach can be applied to other means of transportation, such as buses or trams. To evaluate our solution, we construct a synthetic dataset of passenger trajectories built over the Italian train schedule; the dataset contains trajectories with and without transfers, and with noise in both timestamps and station identifiers. The experimental analysis shows that our solution perfectly reconstructs noiseless trajectories, even with transfers, and achieves an accuracy of at least 86% on noisy data.