Keywords: Spatio-temporal point processes, free-form text generation, semantic event forecasting
TL;DR: A language-augmented spatio-temporal point process jointly forecasts when, where, and what the next event will be, leveraging rich semantic knowledge to generate free-form text as part of event forecasting, a step toward language-based world models.
Abstract: Spatio-temporal point processes (STPPs) provide a principled framework for forecasting discrete events in continuous time and space, yet most existing models represent additional event information as fixed categorical marks. This abstraction is increasingly restrictive for modern event streams, where events are often accompanied by rich free-form textual descriptions, and reducing them to categorical labels can discard valuable information about the underlying event dynamics. Large language models have emerged as powerful tools for handling textual context, but they still lack principled mechanisms for modeling complex spatio-temporal dynamics. We introduce the language-augmented spatio-temporal point process (LA-STPP), a framework that leverages the rich text of past events to predict future events. More importantly, our framework enables free-form text generation as part of event forecasting. LA-STPP couples an STPP-based forecaster with a fine-tuned language model via a shared history representation that encodes past event times, locations, and textual content. By conditioning language generation on spatio-temporal dynamics, LA-STPP predicts not only the timing and location of future events but also their semantic content. Experiments show that LA-STPP substantially improves text prediction quality over text-only baselines while preserving superior spatio-temporal forecasting capability, suggesting a path toward language-capable world models that forecast when, where, and what will happen next.
Submission Number: 123