FACTS: A Future-Aided Causal Teacher-Student Framework for Multimodal Time Series Forecasting

Anonymous Authors (ICLR 2026 Conference Submission 22522)

20 Sept 2025 (modified: 08 Oct 2025), CC BY 4.0
Keywords: Multimodal time series forecasting, knowledge distillation, causal learning
TL;DR: A future-aided causal teacher-student framework for multimodal time series forecasting
Abstract: Traditional \emph{unimodal} time series forecasting models often perform unreliably in real-world applications because they fail to capture the underlying causal drivers of temporal change. Fortunately, auxiliary modalities can unveil these drivers, \textit{e.g.}, sky images capture the illumination conditions that govern solar power generation. However, the most informative \emph{future} auxiliary signals directly tied to the target time series are unavailable at inference, while integrating such data is further hindered by cross-modal heterogeneity and structural mismatch. To address these challenges, we propose FACTS, a Future-Aided Causal Teacher-Student framework for \emph{multimodal} time series forecasting. The teacher network, used only during training, leverages future auxiliary data to disentangle the causal responses underlying temporal dynamics, while the student network, trained solely on historical data, learns such causal knowledge via our proposed causal-perturbation contrastive distillation. To accommodate heterogeneous inputs, we design a bilinear orthogonal projector that efficiently converts high-dimensional auxiliary data into a compact series over time, allowing us to model both auxiliary data and time series via a unified bidirectional attention backbone. Furthermore, we devise a lag-aware fusion to align cross-modal signals within a tolerance window and apply random modality dropout to enhance the student's robustness to missing modalities. Extensive experiments on benchmark datasets demonstrate that FACTS significantly outperforms state-of-the-art methods, achieving average improvements of 32.98\% in MSE and 22.25\% in MAE. Code is available at \url{https://github.com/anonymous202402/FACTS}.
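The bilinear orthogonal projector described in the abstract can be illustrated with a minimal sketch. The sketch below is an assumption about the general idea, not the paper's implementation: each high-dimensional auxiliary frame $X_t \in \mathbb{R}^{H \times W}$ (e.g., a sky-image feature map) is compressed to a $d$-dimensional vector via two fixed orthonormal factors, $y_t = \mathrm{diag}(U^\top X_t V)$, yielding a compact series over time. The function names and the diagonal readout are hypothetical.

```python
import numpy as np

def orthonormal(rows, cols, rng):
    # Orthonormal columns via reduced QR of a random Gaussian matrix.
    q, _ = np.linalg.qr(rng.standard_normal((rows, cols)))
    return q  # (rows, cols), with q.T @ q = I

def bilinear_project(frames, d=8, seed=0):
    """Hypothetical sketch: compress (T, H, W) auxiliary frames into a
    (T, d) series with two orthogonal factors, y_t = diag(U^T X_t V)."""
    rng = np.random.default_rng(seed)
    _, H, W = frames.shape
    U = orthonormal(H, d, rng)
    V = orthonormal(W, d, rng)
    # einsum computes U^T X_t V for every time step t in one pass.
    Y = np.einsum('hd,thw,we->tde', U, frames, V)
    # Keep only the diagonal: d values per time step.
    return np.einsum('tdd->td', Y)

frames = np.random.default_rng(1).standard_normal((32, 64, 48))
series = bilinear_project(frames, d=8)
print(series.shape)  # (32, 8)
```

Once the auxiliary stream is reduced to a low-dimensional series, it has the same shape family as the target time series, which is what allows a single shared attention backbone to process both modalities.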
Primary Area: learning on time series and dynamical systems
Submission Number: 22522