From Images to Signals: Are Large Vision Models Useful for Time Series Analysis?

TMLR Paper9394 Authors

02 Jun 2026 (modified: 03 Jun 2026), Under review for TMLR, CC BY 4.0
Abstract: Large Vision Models (LVMs) are emerging tools for transferring cross-modal knowledge to time series, but their potential for this domain is not yet fully understood. This work addresses the gap by investigating LVMs for both high-level (classification) and low-level (forecasting) time series tasks. Our aim is not only to assess whether LVMs can succeed, but also to reveal why they succeed or fall short. Through a comparative benchmark covering four representative LVMs, eight imaging methods, 18 datasets, and 21 baselines, we identify the strengths and limitations of foundational LVMs, as well as effective strategies for adapting them to time series modeling. Our findings indicate that while LVMs are effective for time series classification, they face notable challenges in forecasting. In particular, the best-performing LVM-based forecaster is limited to specific model types and imaging methods, exhibits biases toward periodic components in time series, and struggles to leverage long look-back windows. We hope our findings will serve as both a cornerstone and a practical guide for advancing future research on LVM-based and multimodal solutions for different time series tasks.
Submission Type: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Markus_Lange-Hegermann1
Submission Number: 9394