Causal Probabilistic Spatio-temporal Fusion Transformers in Two-sided Ride-Hailing Markets

28 Sept 2020 (modified: 05 May 2023)
ICLR 2021 Conference Blind Submission
Readers: Everyone
Keywords: Spatio-temporal Prediction, Causal Inference, Efficient Transformers, Two-sided Markets
Abstract: Accurate spatio-temporal prediction in large-scale systems is extremely valuable in many real-world applications, such as weather forecasting, retail forecasting, and urban traffic forecasting. Most existing methods for multi-horizon, multi-task, and multi-target prediction select important predictor variables via their correlations with the responses, so many of the resulting forecasting models are likely not causal, which leads to poor interpretability. This paper develops a collaborative causal spatio-temporal fusion transformer, named CausalTrans, to establish the collaborative causal effects of predictors on multiple forecasting targets, such as supply and demand in ride-sharing platforms. Specifically, we integrate causal attention with the Conditional Average Treatment Effect (CATE) estimation method for causal inference. Moreover, we propose a novel and fast multi-head attention derived from a Taylor expansion instead of softmax, reducing time complexity from $O(\mathcal{V}^2)$ to $O(\mathcal{V})$, where $\mathcal{V}$ is the number of nodes in a graph. We further design a spatial graph fusion mechanism to significantly reduce the number of parameters. We conduct a wide range of experiments to demonstrate the interpretability of causal attention, the effectiveness of the various model components, and the time efficiency of CausalTrans. In these experiments, CausalTrans achieves up to 15$\%$ error reduction compared with various baseline methods.
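Illustrative note: the abstract does not spell out the Taylor-based attention, so the following is only a minimal sketch of one common way such a linearization can work, not the authors' implementation. It assumes a first-order expansion exp(q·k) ≈ 1 + q·k with L2-normalized queries and keys (so the weights stay non-negative); the function names and the normalization choice are hypothetical. Reordering the matrix products avoids forming the node-by-node weight matrix, which is what makes the cost linear in the number of nodes.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Scale each row to (almost exactly) unit length so that q.k lies in [-1, 1].
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def taylor_attention_linear(Q, K, V):
    """First-order Taylor attention, linear in the number of nodes n.

    Assumed weights: w_ij = 1 + q_i . k_j (an approximation of exp(q_i . k_j)).
    Q, K have shape (n, d); V has shape (n, d_v).
    """
    Q, K = l2_normalize(Q), l2_normalize(K)
    n = K.shape[0]
    kv = K.T @ V            # (d, d_v), computed once for all queries
    k_sum = K.sum(axis=0)   # (d,)
    v_sum = V.sum(axis=0)   # (d_v,)
    numer = v_sum[None, :] + Q @ kv   # sum_j (1 + q_i.k_j) v_j
    denom = n + Q @ k_sum             # sum_j (1 + q_i.k_j)
    return numer / denom[:, None]

def taylor_attention_quadratic(Q, K, V):
    """Reference version that builds the full (n, n) weight matrix."""
    Q, K = l2_normalize(Q), l2_normalize(K)
    W = 1.0 + Q @ K.T
    return (W @ V) / W.sum(axis=1, keepdims=True)
```

Both functions compute the same output; the linear version only needs the (d, d_v) summary `kv` and two sum vectors, so its cost grows with n rather than n squared for fixed feature dimensions.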
One-sentence Summary: We develop a novel causal transformer with causal inference and efficient Taylor attention to address large-scale spatio-temporal predictions. Our method achieves up to 15% error reduction compared with various baseline methods.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=i9kbk52UWw