Combating Missing Values in Multivariate Time Series by Learning to Embed Each Value as a Token

ICLR 2024 Workshop TS4H, Submission 4

Published: 08 Mar 2024, Last Modified: 27 Mar 2024 · TS4H Poster · CC BY 4.0
Keywords: multivariate time series data, representation learning, imputation-free, missing value
Abstract: Irregularly and asynchronously sampled multivariate time series (MTS) data is often riddled with missing values. Most existing methods embed features according to timestamp, which requires missing values to be imputed. However, imputed values can differ drastically from the real values, leading to inaccurate predictions based on the imputation. To address this issue, we propose a novel concept, “each value as a token (EVAT),” which treats each feature value as an independent token and thereby bypasses imputation of missing values. To realize EVAT, we propose scalable numerical embedding, which learns to embed each feature value by automatically discovering the relationships among features. We integrate the proposed embedding method with the Transformer Encoder, yielding the Scalable nUMerical eMbeddIng Transformer (SUMMIT), which produces accurate predictions from MTS with missing values. We conduct experiments on three distinct electronic health record (EHR) datasets with high missing rates. The experimental results verify SUMMIT's efficacy: it attains superior performance compared to models that require imputation.
Submission Number: 4
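
The sketch below illustrates the EVAT idea described in the abstract: only observed (feature, value) pairs become tokens, so missing entries never need to be imputed. This is a minimal illustration, not the authors' implementation; the class name `EVATEmbedding` and the particular way the numeric value is combined with a learned feature embedding are assumptions for illustration only.

```python
# Hypothetical sketch of "each value as a token" (EVAT), not the paper's code.
import torch
import torch.nn as nn


class EVATEmbedding(nn.Module):
    """Embed each observed feature value as one token (illustrative only)."""

    def __init__(self, num_features: int, d_model: int):
        super().__init__()
        # One learnable embedding per feature identity.
        self.feature_embed = nn.Embedding(num_features, d_model)
        # Projects the raw numeric value into the model dimension; the paper's
        # "scalable numerical embedding" may combine value and feature differently.
        self.value_proj = nn.Linear(1, d_model)

    def forward(self, feat_ids: torch.Tensor, values: torch.Tensor) -> torch.Tensor:
        # feat_ids: (num_observed,) integer indices of the observed features
        # values:   (num_observed,) the corresponding measured values
        return self.feature_embed(feat_ids) + self.value_proj(values.unsqueeze(-1))


# Toy usage: a record with 5 possible features but only 3 observed values.
embed = EVATEmbedding(num_features=5, d_model=32)
feat_ids = torch.tensor([0, 2, 4])        # which features were measured
values = torch.tensor([0.7, -1.3, 2.1])   # their measured values
tokens = embed(feat_ids, values)          # (3, 32): one token per observation

# The variable-length token set can then be fed to a standard Transformer encoder.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=32, nhead=4, batch_first=True),
    num_layers=2,
)
out = encoder(tokens.unsqueeze(0))        # (1, 3, 32)
print(out.shape)
```

Because unobserved features simply produce no tokens, no imputation step is needed before the encoder, which is the property the abstract attributes to SUMMIT.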