Keywords: time series forecasting, gating, transformer, mixture of experts, multi-expert, multi-modal
TL;DR: Multi-modal time series forecasting with a gating architecture that fuses the predictions of multiple experts.
Abstract: Forecasting future trends in complex domains often requires leveraging diverse data sources beyond traditional numerical time series. However, integrating heterogeneous data types into a unified forecasting framework remains an underexplored challenge. Existing multi-modal time series forecasting approaches often employ static and simplistic fusion mechanisms or yield non-interpretable representations with limited modularity. We propose GMM-TS, a learnable gating architecture inspired by mixture-of-experts that dynamically integrates predictions from multiple uni-modal experts, each specialized in a distinct modality (e.g., text or numerical signals). Our method computes per-time-step expert weights using a Transformer Encoder, enabling fine-grained, interpretable fusion of two or more experts and supporting both joint and offline training modes. Extensive evaluations show that GMM-TS consistently outperforms state-of-the-art baselines across nine domains, multiple forecast horizons, and various expert configurations. To the best of our knowledge, we are also the first to support integrating more than two experts. Our framework is efficient, extensible, and inherently interpretable. Code will be released upon acceptance.
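To make the gating mechanism concrete, the following is a minimal sketch of per-time-step gated fusion of expert forecasts, assuming a PyTorch-style implementation; the module name, dimensions, and input layout are illustrative assumptions, not the authors' released code.

```python
# Hypothetical sketch: a Transformer Encoder produces per-time-step weights
# that fuse the forecasts of K uni-modal experts (names/shapes are assumptions).
import torch
import torch.nn as nn


class GatedFusion(nn.Module):
    def __init__(self, num_experts: int, d_model: int = 64, nhead: int = 4, num_layers: int = 2):
        super().__init__()
        # Each time step's input is the stacked predictions of all experts.
        self.proj = nn.Linear(num_experts, d_model)
        encoder_layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.gate = nn.Linear(d_model, num_experts)

    def forward(self, expert_preds: torch.Tensor) -> torch.Tensor:
        # expert_preds: (batch, horizon, num_experts) -- one forecast per expert per step.
        h = self.encoder(self.proj(expert_preds))       # (batch, horizon, d_model)
        weights = torch.softmax(self.gate(h), dim=-1)   # per-time-step expert weights
        return (weights * expert_preds).sum(dim=-1)     # fused forecast: (batch, horizon)


# Usage sketch: fuse three experts over a 96-step horizon.
fusion = GatedFusion(num_experts=3)
preds = torch.randn(8, 96, 3)
fused = fusion(preds)  # shape: (8, 96)
```

The softmax over the expert dimension yields interpretable weights at every time step, and adding an expert only changes the size of the last input dimension, which is one plausible reading of the extensibility claim above.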
Primary Area: learning on time series and dynamical systems
Submission Number: 3078