TedPar: Temporally Dependent PARAFAC2 Factorization for Phenotype-based Disease Progression Modeling
Abstract: PARAFAC2 factorization provides a practical solution for mapping temporally irregular electronic health records (EHR) to clinically relevant and interpretable phenotypes. Existing methods ignore the interdependency of diseases over a patient's clinical history. Consequently, the crucial temporal information contained in EHR data cannot be fully utilized, and the learned phenotypes can be sub-optimal for characterizing patients with progressive conditions. To address this issue, we propose a novel temporally dependent PARAFAC2 (TedPar) factorization in which the temporal dependency among the phenotypes is explicitly modeled. TedPar learns a set of target phenotypes to capture the clinical features relevant to the diseases of interest and a set of background phenotypes to capture irrelevant but frequently co-occurring clinical features. By effectively modeling the temporal dependency and separating relevant from irrelevant features, the discovered target phenotypes can be used to model the progression of the diseases of interest. Empirical evaluations show that TedPar obtains up to 32.4% relative improvement in reconstruction accuracy on the test set, suggesting significantly better generalizability than the baselines for both noise-free and heavily noisy input data. Qualitative analysis also shows that TedPar is capable of discovering clinically meaningful phenotypes and capturing the temporal dependency between them.
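For context, the generic PARAFAC2 model that the abstract refers to can be sketched as below. This is the standard formulation from the PARAFAC2 literature, not TedPar's own objective; the abstract does not give the temporal-dependency terms or the target/background split, so they are not shown. All symbols ($X_k$, $U_k$, $S_k$, $V$, $Q_k$, $H$, $R$, $I_k$, $J$, $K$) are notation assumed here for illustration.

```latex
% Assumed notation: X_k is the (visits x clinical features) matrix of patient k,
% with I_k visits (varies per patient) and J features shared by all patients.
\begin{align}
  X_k &\approx U_k S_k V^{\top}, \qquad k = 1, \dots, K, \\
  U_k &= Q_k H, \qquad Q_k^{\top} Q_k = I_R ,
\end{align}
% U_k (I_k x R): temporal trajectory of the R phenotypes for patient k
% S_k (R x R, diagonal): importance of each phenotype for patient k
% V   (J x R): phenotype definitions shared across patients
% The constraint makes U_k^T U_k = H^T H identical for all patients,
% which is what lets the model handle a different number of visits per patient.
\begin{equation}
  \min_{\{Q_k\},\, H,\, \{S_k\},\, V} \;
  \sum_{k=1}^{K} \bigl\| X_k - Q_k H S_k V^{\top} \bigr\|_F^{2}
\end{equation}
```

In this generic form the rows of each $U_k$ are not tied to one another across time, which is consistent with the gap the abstract describes: temporal dependency among phenotypes is not modeled.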