Discovering Latent Causal Graphs from Spatiotemporal Data

Published: 01 May 2025, Last Modified: 18 Jun 2025
Venue: ICML 2025 poster
Readers: Everyone
License: CC BY 4.0
TL;DR: We describe a framework for spatiotemporal causal discovery and prove that its underlying model is identifiable.
Abstract: Many important phenomena in scientific fields like climate, neuroscience, and epidemiology are naturally represented as spatiotemporal gridded data with complex interactions. Inferring causal relationships from these data is a challenging problem compounded by the high dimensionality of such data and the correlations between spatially proximate points. We present SPACY (SPAtiotemporal Causal discoverY), a novel framework based on variational inference, designed to model latent time series and their causal relationships from spatiotemporal data. SPACY alleviates the challenge of high dimensionality by discovering causal structures in the latent space. To aggregate spatially proximate, correlated grid points, we use spatial factors, parametrized by spatial kernel functions, to map observational time series to latent representations. Theoretically, we generalize the problem to a continuous spatial domain and establish identifiability when the observations arise from a nonlinear, invertible function of the product of latent series and spatial factors. Using this approach, we avoid assumptions that are often unverifiable, including those about instantaneous effects or sufficient variability. Empirically, SPACY outperforms state-of-the-art baselines on synthetic data, even in challenging settings where existing methods struggle, while remaining scalable for large grids. SPACY also identifies key known phenomena from real-world climate data. An implementation of SPACY is available at https://github.com/Rose-STL-Lab/SPACY/.
Lay Summary: Many important problems — like tracking weather, brain activity, or disease outbreaks — involve data that change across both space and time, with nearby locations often affecting each other. Figuring out how changes in one area lead to changes in another is important but difficult, because there are so many data points and nearby locations often carry similar information. We developed SPACY, a tool that first compresses the large grid of data into a smaller set of time series, then uncovers how these series influence each other. By grouping nearby locations, SPACY reduces redundant information and focuses on meaningful connections. In tests with simulated data, SPACY accurately finds true cause-and-effect relationships even when other methods struggle. We also validate SPACY by applying it to climate data, where it successfully identifies known climate patterns. Because it works with compressed data, SPACY can handle large datasets, making it useful for studying complex systems in climate science, neuroscience, and public health. The code is freely available for others to use.
Link To Code: https://github.com/Rose-STL-Lab/SPACY/
Primary Area: General Machine Learning->Causality
Keywords: Representation Learning, Causal Discovery, Spatiotemporal Inference, Variational Inference, Time Series Analysis
Submission Number: 9404
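
Illustrative sketch: The abstract describes observations that arise from a nonlinear, invertible function of the product of latent series and spatial factors, with the factors parametrized by spatial kernel functions. The NumPy sketch below simulates data under that assumption so the shapes and roles of each piece are concrete. It is not the SPACY implementation: the function names, the RBF kernel, the lag-1 linear dynamics, and the leaky-ReLU mixing function are all hypothetical choices made for illustration.

```python
import numpy as np


def rbf_spatial_factors(grid, centers, scales):
    """Return a (D, K) matrix of RBF kernel weights, one row per latent series.

    Hypothetical helper: the RBF form is an assumed choice of spatial kernel.
    """
    # Squared distance between every center (D, 1, 2) and grid point (1, K, 2).
    sq_dist = ((centers[:, None, :] - grid[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq_dist / (2.0 * scales[:, None] ** 2))


def leaky_relu(x, slope=0.2):
    """Elementwise invertible nonlinearity (assumed stand-in for the mixing function)."""
    return np.where(x > 0, x, slope * x)


def simulate(T=200, D=3, side=16, seed=0):
    """Simulate gridded observations X from D latent series on a side x side grid."""
    rng = np.random.default_rng(seed)

    # Grid coordinates in [0, 1]^2, flattened to K = side * side points.
    axis = np.linspace(0.0, 1.0, side)
    grid = np.stack(np.meshgrid(axis, axis, indexing="ij"), axis=-1).reshape(-1, 2)

    # Assumed lag-1 causal graph: self-loops plus random lower-triangular edges.
    # A[i, j] is the effect of series j at time t-1 on series i at time t.
    # A is lower triangular with diagonal 0.5, so the dynamics are stable.
    edges = np.tril(rng.random((D, D)) < 0.5, k=-1)
    A = 0.5 * np.eye(D) + 0.3 * edges

    # Latent time series Z with shape (T, D).
    Z = np.zeros((T, D))
    Z[0] = rng.normal(size=D)
    for t in range(1, T):
        Z[t] = Z[t - 1] @ A.T + 0.1 * rng.normal(size=D)

    # Spatial factors F with shape (D, K): each latent series covers a smooth region.
    centers = rng.uniform(0.2, 0.8, size=(D, 2))
    scales = rng.uniform(0.1, 0.2, size=D)
    F = rbf_spatial_factors(grid, centers, scales)

    # Observations: invertible nonlinearity applied to the product Z F, shape (T, K).
    X = leaky_relu(Z @ F)
    return X, Z, F, A


if __name__ == "__main__":
    X, Z, F, A = simulate()
    print(X.shape, Z.shape, F.shape, A.shape)
```

With these defaults, 256 grid points are driven by only 3 latent series, which is the dimensionality reduction the abstract refers to: a causal discovery method in this setting would aim to recover a graph over the 3 latent series rather than over all grid points.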