Keywords: semi-supervised time series classification, missing data
TL;DR: Hidden Markov models handle missingness naturally and can match or beat deep networks when supervised with a carefully designed training objective.
Abstract: When predicting outcomes for hospitalized patients, two key challenges are that the time series features are frequently missing and that supervisory labels may be available for only some sequences. While recent work has offered deep learning solutions, we consider a far simpler approach using the hidden Markov model (HMM). Our probabilistic approach handles missing features via exact marginalization rather than imputation, thereby avoiding predictions that hinge on specific guesses of missing values and fail to account for uncertainty. To add effective supervision, we show that a prediction-constrained (PC) training objective can deliver high-quality predictions as well as interpretable generative models. When predicting mortality risk on two large health records datasets, our PC-HMM's precision-recall performance is equal to or better than the widely used GRU-D, even with 100x fewer parameters. Furthermore, when only a small fraction of sequences have labels, our PC-HMM approach can beat time-series adaptations of MixMatch, FixMatch, and other state-of-the-art methods for semi-supervised deep learning.
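To make the exact-marginalization idea concrete, below is a minimal sketch (not the authors' implementation) assuming diagonal-Gaussian emissions: because features are conditionally independent given the hidden state, a missing feature integrates out exactly by dropping its term from the per-state emission log-likelihood, and the forward algorithm then yields the exact marginal likelihood of the observed values alone. All function names, array shapes, and the toy data are illustrative assumptions.

```python
import numpy as np
from scipy.stats import norm
from scipy.special import logsumexp

def emission_loglik_with_missing(x, mask, means, stds):
    """(T, K) array of log p(x_t^obs | z_t = k), marginalizing missing dims exactly.

    x     : (T, D) observations, arbitrary values (e.g. NaN) where missing
    mask  : (T, D) boolean, True where the feature is observed
    means : (K, D) per-state emission means
    stds  : (K, D) per-state emission standard deviations
    """
    x = np.where(mask, x, 0.0)  # placeholder value; never contributes below
    logpdf = norm.logpdf(x[:, None, :], loc=means[None], scale=stds[None])  # (T, K, D)
    # A missing dimension integrates to 1, so it contributes log 1 = 0.
    logpdf = np.where(mask[:, None, :], logpdf, 0.0)
    return logpdf.sum(axis=-1)

def forward_log_marginal(log_emit, log_pi, log_A):
    """Standard forward recursion in log space; returns log p(x^obs) under the HMM.

    log_A[i, j] = log p(z_t = j | z_{t-1} = i)
    """
    alpha = log_pi + log_emit[0]  # (K,)
    for t in range(1, log_emit.shape[0]):
        alpha = log_emit[t] + logsumexp(alpha[:, None] + log_A, axis=0)
    return logsumexp(alpha)

# Toy usage: T=5 timesteps, D=3 features, K=2 states, ~40% missing at random.
rng = np.random.default_rng(0)
T, D, K = 5, 3, 2
x = rng.normal(size=(T, D))
mask = rng.random((T, D)) > 0.4
means, stds = rng.normal(size=(K, D)), np.ones((K, D))
log_pi, log_A = np.log(np.full(K, 1 / K)), np.log(np.full((K, K), 1 / K))
print(forward_log_marginal(emission_loglik_with_missing(x, mask, means, stds), log_pi, log_A))
```

The prediction-constrained objective, in the generic Lagrangian form of prior PC work (the paper's exact variant may differ), trades the unsupervised sequence likelihood off against a weighted supervised loss on the labeled subset: $\min_\theta \; -\sum_n \log p(x_n \mid \theta) + \lambda \sum_{n \in \mathcal{L}} \ell\big(y_n, \hat{y}(x_n; \theta)\big)$, where $\mathcal{L}$ indexes the labeled sequences and $\lambda$ controls how strongly predictions constrain the generative fit.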