Deep Learning for Urban Planning and Location-Based Services

Published: 22 Sept 2025, Last Modified: 22 Sept 2025 | WiML @ NeurIPS 2025 | CC BY 4.0
Keywords: machine learning, transformer, application, human mobility
Abstract: Accurately attributing user visits to Points of Interest (POIs) is a cornerstone of human mobility analytics, supporting personalized services, marketing, and downstream geospatial tasks such as next-location prediction and anomaly detection [1]. POI attribution maps raw GPS trajectories to semantically meaningful places [7], adding interpretability: identifying a coffee shop visit at 3 pm is far more useful than recording coordinates <latitude, longitude> at time t. Yet attribution is difficult: GPS errors (2–20 meters) and dense urban clustering of POIs (often 50+ within 100 meters) render proximity-based heuristics unreliable. Accurate attribution yields fine-grained behavioral insights (e.g., which store in a strip mall was visited), enabling more precise applications ranging from urban planning [6] to public health, such as identifying potential pandemic hotspots [2]. Conversely, misattributions risk contaminating downstream models, leading them to learn spurious patterns. Despite this complexity, attribution is often reduced to a simple heuristic: assigning each stay to the nearest POI [4]. While straightforward, this approach ignores GPS noise, dense urban settings where multiple POIs fall within error bounds, and contextual signals such as visit duration or time of day. More sophisticated methods [5] can improve accuracy by leveraging detailed spatial features like building footprints and hierarchical metadata, but such information is not universally available. We instead propose POIFormer, a novel Transformer-based framework for POI attribution that jointly models a diverse set of signals: spatial proximity, temporal features of the visit (arrival/departure and dwell time), POI semantics, user-specific mobility patterns, and population-level historical trends.
A key innovation of POIFormer is its explicit incorporation of two dimensions of behavioral context: one capturing individual preferences, and another capturing crowd-level visit patterns. Individual preferences are modeled using a Transformer that considers both past and future visits, with the location of the current (target) visit masked. This context enables the Transformer to evaluate which nearby POI candidate is most likely given the user's past and future visits and the time of day and dwell time of the target visit. Crowd-level historical visit patterns are modeled using the temporal popularity distributions of POIs, estimated via Kernel Density Estimation (KDE). These KDE models capture the joint distribution of location and time (e.g., hour of day) for visits within each POI category. This enables POIFormer to probabilistically downweight unlikely POIs; for example, it reduces the likelihood of assigning a late-night visit to a coffee shop if historical data shows that coffee shops are rarely visited at that hour. The KDEs are pre-computed per category, which makes inference efficient and scalable without sacrificing accuracy: they retain the full joint distribution of location and time, yet require no density estimation at inference time. Finally, POIFormer combines individual and crowd-level scores into a unified likelihood measure, selecting the most probable POI (or set of POIs) among nearby candidates. Furthermore, unlike prior approaches [3, 5], POIFormer makes no restrictive assumptions about POI categories and does not rely on detailed spatial data layers about POIs, thereby enhancing its applicability across diverse geographic and data-constrained contexts.
Extensive experimental evaluation on two publicly available datasets, one simulated and one derived from real-world mobility traces, demonstrates that POIFormer consistently outperforms existing baselines by a substantial margin, including the current state-of-the-art technique proposed by SafeGraph [5], particularly in top-3 and top-5 accuracy.
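To make the crowd-level scoring described in the abstract more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a simplified one-dimensional KDE over hour of day only (the abstract describes a joint distribution of location and time), a hand-written circular Gaussian kernel, and a product rule for combining scores; the abstract does not state how the two scores are combined. The names `hour_kde`, `rank_candidates`, and `individual_scores` are hypothetical, and the individual scores stand in for whatever the Transformer would output.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def hour_kde(hours: Sequence[float], bandwidth: float = 1.5) -> Callable[[float], float]:
    """Build a density over hour of day (0-24) from historical visit hours.

    Assumption: Gaussian kernel on circular distance, so 23:30 and 00:30
    are treated as one hour apart. Normalization is approximate.
    """
    samples = list(hours)
    norm = len(samples) * bandwidth * math.sqrt(2.0 * math.pi)

    def density(hour: float) -> float:
        total = 0.0
        for sample in samples:
            diff = abs(hour - sample) % 24.0
            diff = min(diff, 24.0 - diff)  # wrap around midnight
            total += math.exp(-0.5 * (diff / bandwidth) ** 2)
        return total / norm

    return density


def rank_candidates(
    candidates: List[Dict[str, str]],
    visit_hour: float,
    category_kdes: Dict[str, Callable[[float], float]],
    individual_scores: Dict[str, float],
    eps: float = 1e-9,
) -> List[Tuple[str, float]]:
    """Rank nearby POI candidates by a combined log-score.

    Assumption: the combined score is the product of the individual score
    and the crowd-level density, computed in log space for stability.
    """
    scored = []
    for poi in candidates:
        crowd = category_kdes[poi["category"]](visit_hour)
        individual = individual_scores[poi["id"]]
        log_score = math.log(individual + eps) + math.log(crowd + eps)
        scored.append((poi["id"], log_score))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy historical visit hours per category (made-up illustrative values).
category_kdes = {
    "coffee_shop": hour_kde([7, 8, 8, 9, 10, 15]),
    "bar": hour_kde([21, 22, 23, 23, 0, 1]),
}
candidates = [
    {"id": "cafe", "category": "coffee_shop"},
    {"id": "bar", "category": "bar"},
]
# Equal placeholder individual scores, so the crowd-level prior decides.
individual_scores = {"cafe": 0.5, "bar": 0.5}

ranking = rank_candidates(candidates, 23.5, category_kdes, individual_scores)
print(ranking)
```

In this toy setup both candidates receive the same individual score, so a visit at 23:30 is ranked toward the bar purely because the coffee-shop density at that hour is close to zero. Because the densities are built once per category, scoring a candidate at inference time reduces to a function evaluation.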
Submission Number: 289