- Abstract: In health, machine learning is increasingly common, yet neural network embedding (representation) learning is arguably under-utilized for physiological signals. This inadequacy stands out in stark contrast to more traditional computer science domains, such as computer vision (CV), and natural language processing (NLP). For physiological signals, learning feature embeddings is a natural solution to data insufficiency caused by patient privacy concerns -- rather than share data, researchers may share informative embedding models (i.e., representation models), which map patient data to an output embedding. Here, we present the PHASE (PHysiologicAl Signal Embeddings) framework, which consists of three components: i) learning neural network embeddings of physiological signals, ii) predicting outcomes based on the learned embedding, and iii) interpreting the prediction results by estimating feature attributions in the "stacked" models (i.e., feature embedding model followed by prediction model). PHASE is novel in three ways: 1) To our knowledge, PHASE is the first instance of transferal of neural networks to create physiological signal embeddings. 2) We present a tractable method to obtain feature attributions through stacked models. We prove that our stacked model attributions can approximate Shapley values -- attributions known to have desirable properties -- for arbitrary sets of models. 3) PHASE was extensively tested in a cross-hospital setting including publicly available data. In our experiments, we show that PHASE significantly outperforms alternative embeddings -- such as raw, exponential moving average/variance, and autoencoder -- currently in use. Furthermore, we provide evidence that transferring neural network embedding/representation learners between distinct hospitals still yields performant embeddings and offer recommendations when transference is ineffective.
- Keywords: Representation learning, transfer learning, health, machine learning, physiological signals, interpretation, feature attributions, shapley values, univariate embeddings, LSTMs, XGB, neural networks, stacked models, model pipelines, interpretable stacked models
- TL;DR: Physiological signal embeddings for prediction performance and hospital transference with a general Shapley value interpretability method for stacked models.