Dual-MoE: Learning Time and Channel Dependencies via Dual Mixture-of-Experts for Time Series Forecasting

17 Sept 2025 (modified: 11 Feb 2026), Submitted to ICLR 2026, CC BY 4.0
Keywords: Time Series Forecasting
Abstract: Multivariate time series forecasting holds significant value in finance, energy, and transportation systems, yet faces critical challenges in jointly modeling temporal heterogeneity and dynamic channel dependencies. Existing approaches have difficulty balancing long-term trends with short-term fluctuations, and they struggle to capture time-varying inter-variable relationships. This paper proposes Dual-MoE, a dual mixture-of-experts framework that integrates temporal and channel modeling. The temporal expert dynamically combines multi-scale historical features (e.g., hourly details and weekly patterns) through adaptive gating mechanisms, whereas the channel expert learns dependency weights between variables via frequency-aware interaction modeling. Extensive experiments on real-world datasets demonstrate Dual-MoE's superior forecasting accuracy and robustness compared to state-of-the-art baselines. Its modular architecture provides a flexible and scalable paradigm for complex temporal dependency modeling. Code is available in the Appendix.
Supplementary Material: zip
Primary Area: learning on time series and dynamical systems
Submission Number: 9221
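Illustrative sketch (not the authors' code): the abstract describes a temporal expert that combines multi-scale historical features through adaptive gating. The general idea of input-dependent gating over experts at different time scales can be sketched as below. Everything here is an assumption made for illustration: the experts are simple trailing-window means, the gate is a hypothetical linear score of the last observation passed through a softmax, and the names `gated_forecast`, `windows`, and `gate_params` do not come from the paper. The channel expert is not covered.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def window_mean(history, window):
    """Toy 'expert': mean of the last `window` observations."""
    tail = history[-window:]
    return sum(tail) / len(tail)


def gated_forecast(history, windows, gate_params):
    """One-step forecast as a gate-weighted mix of multi-scale experts.

    history     : list of past values for a single channel
    windows     : one look-back length per expert (assumed scales)
    gate_params : one (slope, bias) pair per expert; the gate score is
                  slope * last_value + bias (a hypothetical gate, not
                  the paper's)
    """
    # Each expert summarises the history at its own time scale.
    expert_preds = [window_mean(history, w) for w in windows]

    # The gate depends on the input, so the mixture adapts per sample.
    last_value = history[-1]
    scores = [slope * last_value + bias for slope, bias in gate_params]
    gates = softmax(scores)

    forecast = sum(g * p for g, p in zip(gates, expert_preds))
    return forecast, gates, expert_preds


# Usage: three experts with short, medium, and long look-back windows.
history = [1.0, 2.0, 3.0, 4.0]
forecast, gates, expert_preds = gated_forecast(
    history,
    windows=[1, 2, 4],
    gate_params=[(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)],  # uniform gate
)
```

With all gate parameters set to zero the gate is uniform, so the forecast is the plain average of the three expert outputs. Non-zero slopes make the weighting depend on the most recent observation, which is the sense in which the gating is adaptive in this toy version.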