Keywords: explainability, novel architectures, classification, missing values, noisy/irregular measurements, transformers
TL;DR: By performing temporal and sensor attention in parallel, the Parallel Attention Transformer is competitive with the state of the art on EHR classification and has a structure that allows multidimensional explanations.
Abstract: When working with electronic health records (EHR), it is critical for deep learning (DL) models to achieve both high performance and explainability. Here we present the Parallel Attention Transformer (PAT), which performs temporal and sensor attention in parallel, is competitive with state-of-the-art models in EHR classification, and has a uniquely explainable structure. PAT is trained on two EHR datasets, compared against five DL models of different architectures, and its attention weights are used to visualize key sensors and time points. Our results show that PAT is particularly well suited for healthcare and pharmaceutical applications, where there is strong interest in identifying the key features that differentiate patient groups and conditions, and the key times for intervention.
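The parallel temporal/sensor attention described in the abstract could look something like the following minimal PyTorch sketch. All names (`ParallelAttentionBlock`, the branch layout, the residual combination) are illustrative assumptions, not the paper's actual implementation; the two branches' attention weights correspond to the key time points and key sensors used for explanations.

```python
import torch
import torch.nn as nn

class ParallelAttentionBlock(nn.Module):
    """Hypothetical sketch: self-attention over the time axis and the
    sensor axis computed in parallel, then combined with a residual."""

    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.temporal_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.sensor_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, sensors, d_model)
        b, t, s, d = x.shape

        # Temporal branch: attend across time steps, independently per sensor.
        xt = x.permute(0, 2, 1, 3).reshape(b * s, t, d)
        t_out, t_weights = self.temporal_attn(xt, xt, xt)  # weights highlight key time points
        t_out = t_out.reshape(b, s, t, d).permute(0, 2, 1, 3)

        # Sensor branch: attend across sensors, independently per time step.
        xs = x.reshape(b * t, s, d)
        s_out, s_weights = self.sensor_attn(xs, xs, xs)  # weights highlight key sensors
        s_out = s_out.reshape(b, t, s, d)

        # Sum the two parallel branches with a residual connection.
        return self.norm(x + t_out + s_out)

# Usage: a batch of 8 records, 48 time steps, 12 sensors, 64-dim embeddings.
block = ParallelAttentionBlock()
out = block(torch.randn(8, 48, 12, 64))
print(out.shape)  # torch.Size([8, 48, 12, 64])
```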
Submission Number: 24