Presentation Attendance: No, we cannot present in person
Keywords: multi-scale modeling, self-supervised learning, transformers
TL;DR: We propose a multi-scale self-supervised pretraining framework for time series that learns cross-scale representations by coarse-to-fine reconstruction, improving performance across diverse downstream tasks.
Abstract: Time series inherently capture dynamic processes across multiple scales, yet standard self-supervised methods typically operate at a single scale by segmenting data into discrete patches. This disrupts temporal continuity and neglects the hierarchical interactions that define the signal, limiting adaptability across diverse tasks. To address this, we propose NoTS, a pre-training framework that learns cross-scale relationships by reconstructing fine-scale signals from progressively degraded coarse approximations. Theoretically, we show that this multi-scale sequence modeling enhances representational capacity compared to single-scale patch approaches. Empirically, NoTS achieves a 26% improvement in synthetic feature regression and outperforms existing methods by up to 6% across 22 real-world datasets in classification, imputation, and anomaly detection. Moreover, NoTS consistently boosts the performance of existing transformer backbones, establishing it as a theoretically grounded foundation for time series analysis.
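To make the coarse-to-fine reconstruction objective concrete, the sketch below shows one way such a pre-training loss could be set up: the fine-scale signal is degraded by average pooling at several scales, and a small transformer encoder is trained to reconstruct the original signal from each coarse version. This is an illustrative assumption, not the NoTS implementation; all module names, scales, and hyperparameters are hypothetical.

```python
# Minimal sketch (not the authors' code): coarse-to-fine reconstruction
# pretraining for univariate time series. Names and hyperparameters are
# illustrative assumptions, not the NoTS architecture.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CoarseToFineReconstructor(nn.Module):
    """Encode a coarse (downsampled) signal and reconstruct the fine signal."""
    def __init__(self, d_model: int = 64, n_layers: int = 2):
        super().__init__()
        self.embed = nn.Linear(1, d_model)          # per-timestep embedding
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, 1)           # per-timestep output

    def forward(self, x_coarse: torch.Tensor, target_len: int) -> torch.Tensor:
        # x_coarse: (batch, coarse_len, 1); upsample back to fine resolution
        h = self.encoder(self.embed(x_coarse))
        y = self.head(h).transpose(1, 2)            # (batch, 1, coarse_len)
        y = F.interpolate(y, size=target_len, mode="linear", align_corners=False)
        return y.transpose(1, 2)                    # (batch, target_len, 1)

def multiscale_reconstruction_loss(model, x_fine, scales=(2, 4, 8)):
    """Degrade the signal at several scales, reconstruct, and average the MSE."""
    loss = 0.0
    for s in scales:
        # Coarse approximation via average pooling with stride s.
        x_coarse = F.avg_pool1d(x_fine.transpose(1, 2),
                                kernel_size=s, stride=s).transpose(1, 2)
        x_hat = model(x_coarse, target_len=x_fine.size(1))
        loss = loss + F.mse_loss(x_hat, x_fine)
    return loss / len(scales)

# Toy usage: a batch of 8 univariate series of length 128.
model = CoarseToFineReconstructor()
x = torch.randn(8, 128, 1)
print(multiscale_reconstruction_loss(model, x).item())
```

In this reading, training on several degradation scales at once is what encourages the encoder to capture cross-scale structure rather than features tied to a single patch resolution.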
Track: Research Track (max 4 pages)
Submission Number: 50