Super-Linear: A Lightweight Pretrained Mixture of Linear Experts for Time Series Forecasting

Liran Nochumsohn; Raz Marshanski; Hedi Zisling; Omri Azencot

Published: 07 May 2026, Last Modified: 07 May 2026 | Accepted by TMLR | CC BY 4.0

Abstract: Time series forecasting (TSF) is critical in domains like energy, finance, healthcare, and logistics, requiring models that generalize across diverse datasets. Large pre-trained models such as Chronos and Time-MoE show strong zero-shot (ZS) performance but suffer from high computational costs. In this work, we introduce Super-Linear, a lightweight and scalable mixture-of-experts (MoE) model for general forecasting. It replaces deep architectures with simple frequency-specialized linear experts. A lightweight spectral gating mechanism dynamically selects relevant experts, enabling efficient, accurate forecasting. Crucially, resampling during training exposes the model to diverse frequency regimes, while a flexible input adaptation strategy allows it to handle varying inference lengths. Despite its simplicity, Super-Linear demonstrates strong performance across benchmarks, while substantially improving efficiency, robustness to sampling rates, and interpretability.
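To make the architecture described in the abstract concrete, the following is a minimal sketch of a mixture of linear experts routed by a spectral gate. It is an illustration under stated assumptions, not the authors' implementation: the class name `LinearMoESketch`, the use of a normalized FFT magnitude spectrum as the gate input, the linear gate, and the top-k softmax routing are all hypothetical choices, and the weights are random rather than trained. The abstract only states that the experts are linear and frequency-specialized and that a lightweight spectral gate selects among them.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(z - z.max())
    return e / e.sum()


class LinearMoESketch:
    """Hypothetical mixture of linear experts with a spectral gate (illustrative only)."""

    def __init__(self, lookback, horizon, n_experts, top_k=2, seed=0):
        rng = np.random.default_rng(seed)
        self.top_k = top_k
        # One linear map (lookback -> horizon) per expert.
        # Random here; in a real model these would be learned.
        self.experts = rng.normal(
            scale=1.0 / lookback, size=(n_experts, horizon, lookback)
        )
        # Assumed gate: a linear map from the magnitude spectrum
        # (lookback // 2 + 1 real-FFT bins) to one score per expert.
        n_bins = lookback // 2 + 1
        self.gate = rng.normal(scale=1.0 / n_bins, size=(n_experts, n_bins))

    def gate_weights(self, x):
        """Return one mixing weight per expert; only top_k are non-zero."""
        # Normalized magnitude spectrum of the mean-removed input window.
        spectrum = np.abs(np.fft.rfft(x - x.mean()))
        spectrum = spectrum / (spectrum.sum() + 1e-8)
        scores = self.gate @ spectrum
        # Keep the top_k highest-scoring experts and renormalize over them.
        top = np.argsort(scores)[-self.top_k:]
        weights = np.zeros_like(scores)
        weights[top] = softmax(scores[top])
        return weights

    def forecast(self, x):
        """Weighted sum of the selected experts' linear forecasts."""
        w = self.gate_weights(x)
        active = np.flatnonzero(w)
        # Only the selected experts are evaluated.
        return sum(w[i] * (self.experts[i] @ x) for i in active)


# Usage: route a synthetic sine window through 8 experts, 2 active.
model = LinearMoESketch(lookback=96, horizon=24, n_experts=8, top_k=2)
window = np.sin(2 * np.pi * np.arange(96) / 24.0)
mix = model.gate_weights(window)
prediction = model.forecast(window)
```

In this sketch each forecast costs one FFT plus `top_k` matrix-vector products, which is one plausible reading of why such a design can be much cheaper than a deep pretrained model. The resampling and input-adaptation steps mentioned in the abstract are not modeled here.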

Submission Type: Regular submission (no more than 12 pages of main content)

Changes Since Last Submission:

**Submission**
* Added a comparison to Chronos-2 (Figure 1, Figure 5, and Table 3).
* Added clarifications based on reviewer comments and fixed typos.
* In Table 2 (right), added a comparison to a setup that does not resample the data during training (WO Tr Resamp).
* Added a new section that compares Super-Linear to LLM-based models (Appendix B.4).
* Added a comparison to DUET and FEDformer (Appendix Table 4).
* Added a case study visualization (Appendix I).
* Improved the emphasis on resampling (as augmentation during training and at inference) in the abstract, introduction, and conclusion.
* Elaborated on the discussion of Table 2 in Section 4.2 (method ablation).

**Camera Ready**
* Removed the red marking used for change tracking during the discussion phase.
* Reordered figures and tables in the main manuscript due to misplacement caused by the **accepted** template.
* Added a Training–Evaluation Overlap Analysis in Section H.1 (Reviewer 16zx).

Code: https://github.com/azencot-group/SuperLinear

Assigned Action Editor: ~Jacek_Cyranka1

Submission Number: 7196
