Multi-Level Spatial Embedding Sharing for Enhanced Online Trajectory-User Linking

TMLR Paper8765 Authors

04 May 2026 (modified: 15 May 2026) | Under review for TMLR | CC BY 4.0

Abstract: Trajectory-User Linking (TUL) is a critical task in mobility applications that links unlabeled spatial trajectories to the users or entities that generated them. In these applications, data often arrives as a continuous stream and may experience distributional shifts over time. While adapting TUL models via online learning could address these challenges, this approach remains unexplored in current research. Our work bridges this gap by conducting comprehensive evaluations of common TUL techniques in an online learning context. To improve the performance of existing TUL techniques in this setting, we propose Multi-Level Spatial Embedding Sharing (MiLES), an embedding approach that adapts and extends the principle of multi-scale spatial sharing for online TUL. MiLES partially shares embeddings across neighborhoods of multiple size levels, enabling generalization within neighborhoods while maintaining fine-grained discrimination through more location-specific representations. MiLES also significantly reduces the number of embedding parameters, leading to lower memory usage and more computationally efficient model updates. We further incorporate learnable weighting parameters for each embedding level, allowing the model to learn the influence of different levels during training. Our experimental results on several real-world datasets show that integrating MiLES into state-of-the-art TUL models significantly improves their performance in online learning scenarios, yielding relative gains in top-1 accuracy of up to 24%, with consistent improvements observed across other training paradigms as well. The online gains are particularly relevant because our findings suggest that online learning is the most suitable paradigm for real-time TUL on streaming data, outperforming periodic batch retraining at substantially lower computational cost. To demonstrate its general applicability, we also evaluate MiLES on the task of destination prediction, where it provides consistent performance improvements, confirming its value as a domain-general embedding technique. Our code is available at https://anonymous.4open.science/r/MiLES-3D20.
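
The idea of sharing embeddings across neighborhoods of several sizes, combined through learnable level weights, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes square latitude/longitude grid cells as neighborhoods, a softmax over one logit per level as the weighting, and a weighted sum as the combination rule; none of these details are stated on this page. The class name `MultiLevelGridEmbedding` and all parameters are hypothetical.

```python
import math
import random


class MultiLevelGridEmbedding:
    """Illustrative multi-level shared spatial embedding (assumed design)."""

    def __init__(self, cell_sizes_deg, dim, seed=0):
        # Assumption: one square grid per level, cell size given in degrees.
        self.cell_sizes_deg = list(cell_sizes_deg)
        self.dim = dim
        self.rng = random.Random(seed)
        # One table per level, filled lazily: cell index -> embedding vector.
        self.tables = [dict() for _ in self.cell_sizes_deg]
        # One logit per level; a trainer would update these (assumed softmax).
        self.level_logits = [0.0 for _ in self.cell_sizes_deg]

    def cell_index(self, lat, lon, size):
        # Every point inside the same cell maps to the same key.
        return (math.floor(lat / size), math.floor(lon / size))

    def level_weights(self):
        # Numerically stable softmax over the level logits.
        m = max(self.level_logits)
        exps = [math.exp(x - m) for x in self.level_logits]
        total = sum(exps)
        return [e / total for e in exps]

    def _lookup(self, level, key):
        table = self.tables[level]
        if key not in table:
            table[key] = [self.rng.gauss(0.0, 0.1) for _ in range(self.dim)]
        return table[key]

    def embed(self, lat, lon):
        # Weighted sum of the per-level cell embeddings for this location.
        weights = self.level_weights()
        out = [0.0] * self.dim
        for level, size in enumerate(self.cell_sizes_deg):
            vec = self._lookup(level, self.cell_index(lat, lon, size))
            for i in range(self.dim):
                out[i] += weights[level] * vec[i]
        return out


# Two nearby points: same coarse and medium cells, different fine cells.
emb = MultiLevelGridEmbedding(cell_sizes_deg=[0.1, 0.01, 0.001], dim=4)
a = emb.embed(40.7128, -74.0065)
b = emb.embed(40.7134, -74.0065)
rows_per_level = [len(table) for table in emb.tables]
```

In this sketch the two points reuse a single row at the two coarser levels and get separate rows only at the finest level, which is the sense in which nearby locations share parameters while staying distinguishable. Coarser levels also need far fewer rows than a table with one embedding per location.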

Submission Type: Long submission (more than 12 pages of main content)

Previous TMLR Submission Url: https://openreview.net/forum?id=LGflWbxAuP

Changes Since Last Submission: This submission has undergone a revision based on feedback from a previous review process. The most significant addition is a new experiment comparing online learning against periodic batch retraining. A full list of changes follows.

## New Experiments & Analyses

- Comparison of online learning against periodic batch retraining with intervals of 2,500 and 5,000 trajectories across all three datasets (see Pages 14 & 15). Online learning consistently outperforms retraining while requiring approximately 120x fewer forward and backward passes.
- Rolling top-1 accuracy analysis over the data stream for Foursquare-NYC and GeoLife with confidence bands, showing consistent stream-wise advantage of MiLES (Figure 2).
- Batch vs. online comparison with separately tuned hyperparameters, confirming that MiLES's gains are consistently larger in the online regime (see Page 15).

## Revised Framing and Positioning

- Revised introduction and related work to explicitly acknowledge the lineage of grid-based and multi-scale spatial encoding methods. Added discussion of Space2Vec (Mai et al., 2020) and Fourier features (Tancik et al., 2020). MiLES is now positioned as adapting and extending multi-scale spatial sharing for online TUL rather than introducing a fully new modeling principle.
- Narrowed concept drift claims throughout. The conclusion now separates the well-supported data-scarcity finding from the more tentative drift robustness finding.
- Expanded appendix with two new descriptive analyses: KL divergence of POI distributions on Foursquare-NYC and label entropy on GeoLife.

## Clarifications and Expanded Discussion

- Added explicit justification of the hyperparameter tuning protocol as a deliberate design choice reflecting online deployment constraints.
- Added discussion of pre-allocation vs. dynamic expansion in the approach section and limitations.
- Expanded limitations section to explain when MiLES is and is not beneficial, and to discuss alternative retraining strategies as future work.
- Added end-to-end pipeline description in the method section.
- Replaced all references to "attention mechanism" with "learnable level weighting" throughout.

Assigned Action Editor: ~Pablo_Samuel_Castro1

Submission Number: 8765
