Keywords: Time Series, Time Forecasting, Transformer, Frequency
Abstract: Current time series forecasting architectures mainly rely on single, unified
solutions that lack specialization, limiting their ability to adapt to different
temporal dependencies within the same model. These approaches struggle to
efficiently capture the heterogeneous nature of time series data, where different
subsequences may require distinct modeling strategies. To address these challenges, we
propose MFMformer: Multi-resolution Mixture-of-Experts gating for Time Series
Forecasting that combines multi-scale temporal processing with MoE layers.
MFMformer introduces two key innovations: (i) an overlapping multi-resolution
decomposition mechanism that splits input sequences into 50% overlapping
chunks across multiple temporal scales, with instance normalization applied
independently to each scale, inspired by the short-time Fourier transform; (ii)
Mixture-of-Experts gating that uses the top-3 dominant frequencies from an FFT
analysis to route inputs between two specialized expert networks, enhancing both
representational capacity and computational efficiency. Extensive benchmarks on
long-term and short-term time series forecasting datasets show that MFMformer
achieves results competitive with state-of-the-art methods.
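The two mechanisms described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the chosen window scales, and the toy gating rule (hashing the top-3 frequency bins to an expert index) are all assumptions for illustration; the paper presumably learns the gate rather than hard-coding it.

```python
import numpy as np

def multires_chunks(x, scales=(8, 16)):
    """Split a 1-D series into 50%-overlapping windows at each temporal scale,
    instance-normalizing each window independently (STFT-style framing).
    Hypothetical sketch of mechanism (i); scales are illustrative."""
    out = {}
    for w in scales:
        hop = w // 2  # 50% overlap between consecutive chunks
        chunks = []
        for start in range(0, len(x) - w + 1, hop):
            c = x[start:start + w].astype(float)
            # per-chunk instance normalization (zero mean, unit variance)
            c = (c - c.mean()) / (c.std() + 1e-8)
            chunks.append(c)
        out[w] = np.stack(chunks)
    return out

def freq_gate(x, n_experts=2, top_k=3):
    """Route a window to one of n_experts using its top-k dominant FFT
    frequencies. The hash-based rule here is a toy stand-in for a learned
    gate, kept only to show the FFT -> top-3 bins -> expert-id pipeline."""
    mags = np.abs(np.fft.rfft(x))
    mags[0] = 0.0  # ignore the DC component
    top_bins = np.argsort(mags)[-top_k:]  # indices of the 3 strongest bins
    return int(top_bins.sum()) % n_experts
```

For example, a length-64 series chunked at scale 8 yields 15 overlapping, individually normalized windows, each of which the gate maps to expert 0 or 1 based on its dominant frequencies.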
Primary Area: learning on time series and dynamical systems
Submission Number: 24040