Keywords: Interpretable Machine Learning, Deep Tabular Learning, Biomedical Data
TL;DR: We propose ProtoGate, a prototype-based neural model for local feature selection on high-dimensional, low-sample-size datasets, particularly tabular biomedical data.
Abstract: Tabular biomedical data poses challenges for machine learning because it is often high-dimensional and typically low-sample-size. Previous research has attempted to address these challenges via feature selection approaches, which can lead to unstable performance and insufficient interpretability on real-world data. This suggests that current methods lack appropriate inductive biases for capturing the informative patterns in different samples. In this paper, we propose ProtoGate, a local feature selection method that introduces an inductive bias by attending to the clustering characteristic of biomedical data. ProtoGate selects features in a global-to-local manner and uses them to produce explainable predictions via an interpretable prototype-based model. We conduct comprehensive experiments to evaluate ProtoGate on synthetic and real-world datasets. Our results show that exploiting the homogeneous and heterogeneous patterns in the data can improve prediction accuracy, while the prototypes provide interpretability.
Submission Number: 28