Abstract: Time series models are ubiquitous in fields of science that deal with temporally structured data. Recent work in time series analysis shows a growing trend toward tailor-made transformer neural networks with customized attention blocks and intricate hand-crafted design features. We show, perhaps surprisingly and against this trend, that a simple generative transformer for time series, dubbed tsGT, based on a vanilla decoder-only architecture with discretization of real values, outperforms more sophisticated contemporary models on selected prediction tasks. We evaluate tsGT against eleven baselines and show that it surpasses its deterministic peers on mean absolute error (MAE) and root mean squared error (RMSE), and its stochastic peers on quantile loss (QL) and continuous ranked probability score (CRPS), on four commonly used datasets: electricity, traffic, ETTm2, and weather. We use a well-known and theoretically justified rolling window evaluation protocol and provide a detailed analysis of tsGT's ability to model the data distribution and predict marginal quantile values. We provide an implementation of our method at https://github.com/ts-gt/tsgt.
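To make the core idea concrete, below is a minimal sketch of how real-valued observations can be discretized into a finite token vocabulary before being modeled by a decoder-only transformer with a next-token objective. This is an illustration under assumed design choices (quantile-based bin edges, a hypothetical num_bins of 256, and the helper names fit_bin_edges/encode/decode), not the exact tokenizer used by tsGT; see the linked repository for the actual implementation.

import numpy as np

# Hypothetical sketch: discretize real values into tokens via empirical
# quantile bins. The bin count and edge placement are assumptions, not
# necessarily the paper's exact choices.

def fit_bin_edges(train_values: np.ndarray, num_bins: int = 256) -> np.ndarray:
    """Estimate interior bin edges from empirical quantiles of training data."""
    interior = np.linspace(0.0, 1.0, num_bins + 1)[1:-1]
    return np.quantile(train_values, interior)

def encode(values: np.ndarray, edges: np.ndarray) -> np.ndarray:
    """Map each real value to the index of its bin, i.e. a discrete token."""
    return np.searchsorted(edges, values)

def decode(tokens: np.ndarray, edges: np.ndarray) -> np.ndarray:
    """Map tokens back to representative real values (bin centers)."""
    centers = np.concatenate(
        [[edges[0]], (edges[:-1] + edges[1:]) / 2.0, [edges[-1]]]
    )
    return centers[tokens]

# Usage: the token sequence can be trained with standard next-token
# cross-entropy; sampling tokens autoregressively and decoding them yields
# stochastic trajectories from which marginal quantiles can be estimated.
history = np.sin(np.linspace(0, 20, 1000)) + 0.1 * np.random.randn(1000)
edges = fit_bin_edges(history)
tokens = encode(history, edges)
assert decode(tokens, edges).shape == history.shape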
Submission Length: Regular submission (no more than 12 pages of main content)
Changes Since Last Submission: N/A
Assigned Action Editor: ~Yingnian_Wu1
Submission Number: 3000