What If TSF: A Multimodal Benchmark for Conditional Time Series Forecasting with Plausible Scenarios

ICLR 2026 Conference Submission 16804 Authors

19 Sept 2025 (modified: 08 Oct 2025), ICLR 2026 Conference Submission, Readers: Everyone, CC BY 4.0

Keywords: Multimodality, Time Series Forecasting, Future Scenarios, Counterfactuals

TL;DR: Time series forecasting still relies on historical patterns, while multimodal gains remain limited by redundant text. The WIT Benchmark provides expert-authored “what-if” scenarios to test whether models go beyond pattern replication.

Abstract: Time series forecasting has long been constrained by history-bound, unimodal methods and benchmarks that fail to capture predictive, forward-looking context. Recent progress in large language models and multimodal alignment suggests richer possibilities, yet most existing multimodal benchmarks rely on textual descriptions that merely repeat historical patterns and can introduce misleading signals due to irrelevant context. To advance research in this area, we introduce “What If TSF (WIT)”, a benchmark constructed around expert-crafted what-if scenarios and explicit future events. WIT encourages models not only to match historical patterns but also to reason under uncertainty, evaluating their ability to integrate multimodal signals, anticipate plausible futures, and enable conditional forecasts. By moving beyond historical pattern extraction, WIT establishes a principled testbed for scenario-guided multimodal forecasting.

Supplementary Material: zip

Primary Area: datasets and benchmarks

Submission Number: 16804
