Pruning neural network models for gene regulatory dynamics using data and domain knowledge

Intekhab Hossain; Jonas Fischer; Rebekka Burkholz; John Quackenbush

Published: 25 Sept 2024, Last Modified: 09 Jan 2025

Venue: NeurIPS 2024 poster

License: CC BY 4.0

Keywords: neural network pruning, sparsification, domain knowledge, gene regulation

TL;DR: We leverage domain knowledge to inform neural network pruning, thereby obtaining interpretable models that align with known biology

Abstract: The practical utility of machine learning models in the sciences often hinges on their interpretability. It is common to assess a model's merit for scientific discovery, and thus novel insights, by how well it aligns with already available domain knowledge, a dimension that is currently largely disregarded in the comparison of neural network models. While pruning can simplify deep neural network architectures and excels at identifying sparse models, we show in the context of gene regulatory network inference that state-of-the-art techniques struggle with biologically meaningful structure learning. To address this issue, we propose DASH, a generalizable framework that guides network pruning by using domain-specific structural information in model fitting and leads to sparser, more interpretable models that are more robust to noise. Using both synthetic data with ground-truth information and real-world gene expression data, we show that DASH, using knowledge about gene interaction partners within the putative regulatory network, outperforms general pruning methods by a large margin and yields deeper insights into the biological systems being studied.

Primary Area: Machine learning for healthcare

Submission Number: 19966
