TOTEM: Tokenized Time Series Embeddings For General Time Series Analysis

ICLR 2024 Workshop TS4H, Submission 30

Published: 08 Mar 2024, Last Modified: 01 Apr 2024 · TS4H Poster · CC BY 4.0
Keywords: representation learning, tokenization, time series
TL;DR: We present TOTEM: a simple, performant time series tokenizer that works across domains and tasks, thereby enabling generalist modeling with strong in-domain and zero-shot performance.
Abstract: Learning with time series health data poses many challenges, such as variability in sensor semantics (e.g., neural voltage recordings vs. US birth rates), difficulty in accessing data, and relatively small data volumes compared to other time series domains. Given these limitations, and the fact that the field of general time series analysis has recently begun to explore unified modeling, we approach unification from a complementary vantage point, ultimately to benefit zero-shot performance on health time series. Historically, unification in general time series analysis has meant retraining a common architectural backbone on a specific task for a specific dataset; we instead study the unification of time series data representations across domains and tasks. To this end, we explore the impact of discrete, learnt time series data representations that enable generalist, cross-domain training. Our method, TOTEM, or Tokenized Time Series Embeddings, proposes a simple tokenizer architecture that embeds time series data from varying domains using a discrete vectorized representation learned in a self-supervised manner. TOTEM works across multiple tasks and domains with minimal to no tuning. We study TOTEM's efficacy with an extensive evaluation on 17 real-world time series datasets across 3 tasks. Notably, the majority of our zero-shot datasets are health time series from the neuroscience and birth domains. We evaluate both the specialist regime (i.e., training a model on each domain) and the generalist regime (i.e., training a single model on many domains), and show that TOTEM matches or outperforms previous best methods on several popular benchmarks.
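
The abstract describes the tokenizer only at a high level: a discrete, vectorized representation learned self-supervised. As one concrete reading of that description, the sketch below shows a minimal VQ-VAE-style tokenizer for univariate time series windows. The class name `VQTokenizer`, the convolutional encoder/decoder, and all hyperparameters (codebook size 256, compression factor 4) are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VQTokenizer(nn.Module):
    """Minimal sketch of a vector-quantized time series tokenizer.

    A 1D conv encoder compresses a window of length T into T/compression
    latent vectors; each latent is snapped to its nearest codebook entry
    (the discrete "token"); a 1D conv decoder reconstructs the input.
    """

    def __init__(self, codebook_size=256, dim=64, compression=4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv1d(1, dim, kernel_size=compression, stride=compression),
            nn.ReLU(),
            nn.Conv1d(dim, dim, kernel_size=3, padding=1),
        )
        self.codebook = nn.Embedding(codebook_size, dim)
        self.decoder = nn.Sequential(
            nn.Conv1d(dim, dim, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(dim, 1, kernel_size=compression, stride=compression),
        )

    def forward(self, x):
        # x: (batch, 1, T) raw series window
        z_e = self.encoder(x).permute(0, 2, 1)        # (batch, T', dim)
        # nearest-neighbour lookup in the codebook gives discrete token ids
        flat = z_e.reshape(-1, z_e.size(-1))
        d = torch.cdist(flat, self.codebook.weight)   # (batch*T', codebook_size)
        tokens = d.argmin(dim=-1).view(z_e.shape[:2]) # (batch, T')
        z_q = self.codebook(tokens)                   # quantized latents
        # straight-through estimator so gradients reach the encoder
        z_q_st = z_e + (z_q - z_e).detach()
        recon = self.decoder(z_q_st.permute(0, 2, 1))
        # reconstruction + codebook + commitment losses (standard VQ-VAE terms)
        loss = (F.mse_loss(recon, x)
                + F.mse_loss(z_q, z_e.detach())
                + 0.25 * F.mse_loss(z_e, z_q.detach()))
        return tokens, recon, loss

# Usage: self-supervised training needs only raw windows, no labels.
model = VQTokenizer()
x = torch.randn(8, 1, 96)       # 8 windows of length 96
tokens, recon, loss = model(x)
loss.backward()
print(tokens.shape)             # torch.Size([8, 24]) -> 24 tokens per window
```

The resulting token ids form a domain-agnostic discrete vocabulary, which is what allows a single downstream model to be trained across datasets with very different sensor semantics.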
Submission Number: 30