Knowledge-Infused Network with Diverse Sparsity and In-Context Learning for High-Dimensional Tabular Biomedical Data
Abstract: Tabular datasets in the biomedical field commonly have high dimensionality and limited sample sizes. In such settings, neural networks often overfit and therefore perform worse than linear and tree-based models. Auxiliary domain knowledge about biomedical data is abundant yet often overlooked, even though it has the potential to significantly improve neural network performance. To address this issue, we propose the Knowledge-Infused Network with Diverse Sparsity and In-Context Learning (KNDI), a novel deep learning architecture designed to handle high-dimensional, small-sample genomic data. KNDI consists of a novel Diverse Sparsity Feature Abstraction (DSFA) module and a Locally-Aligned Contextual Learning (LACL) module, which together infuse auxiliary biomedical knowledge and boost performance. The DSFA module extracts diverse high-level features from high-dimensional inputs by integrating a sparsity mechanism to reduce dimensionality and applying contrastive constraints to achieve feature diversity. The LACL module then adaptively learns contextual information from the high-level feature embeddings, incorporating locality-aware interactions and preserving the original information through self-alignment, which facilitates the subsequent multiplicative fusion. Extensive experiments on six high-dimensional, low-sample-size genomic datasets show that KNDI improves the Pearson correlation coefficient by an average of 10.45% over the best-performing baselines.
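For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how a Pearson correlation coefficient between true and predicted targets can be computed. It illustrates the standard definition only. It is not the authors' evaluation code, and the function name `pearson` and the toy values are assumptions made for this example. How the reported 10.45% average improvement is aggregated across datasets is not specified in the abstract.

```python
import math


def pearson(y_true, y_pred):
    """Pearson correlation coefficient between two equal-length sequences.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(y_true)
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    var_true = sum((t - mean_true) ** 2 for t in y_true)
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    if var_true == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_true * var_pred)


# Toy values: predictions that are an exact positive linear function of the
# targets give a coefficient of 1, and a reversed ordering gives -1.
perfect = pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
reversed_order = pearson([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

The coefficient ranges from -1 to 1 and measures linear agreement rather than absolute error, so it rewards predictions that rank and scale samples consistently with the targets.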
External IDs: dblp:conf/bibm/WangGWS24