Efficient structure learning of gene regulatory networks with Bayesian active learning

Dániel Sándor, Péter Antal

Published: 2025, Last Modified: 28 Jul 2025. Venue: BMC Bioinform. 2025. License: CC BY-SA 4.0

Abstract: Gene regulatory network modeling is a complex structure learning problem that involves both observational data analysis and experimental interventions. Bayesian causal discovery provides a principled framework for modeling observational data, generating posterior distributions that best represent the underlying structure. While recent algorithms offer efficient and accurate structure learning, integrating experiment design can further enhance predictive performance. We introduce novel acquisition functions for experiment design in gene expression data, leveraging active learning in both Essential Graph and Graphical Model spaces. We evaluate scalable structure learning algorithms within an active learning framework to optimize intervention selection. Our study explores existing active learning strategies, adapts techniques from other domains to structure learning, and proposes a novel approach using Equivalence Class Entropy Sampling (ECES) and Equivalence Class BALD Sampling (EBALD). Using DREAM4’s GeneNetWeaver and Sachs protein signaling data, we assess the effectiveness of different strategies in improving network learning. Existing Bayesian experiment design strategies often overlook the Essential Graph structure, making inference more challenging due to the large number of possible graphs. Our results demonstrate that integrating active learning into structure learning algorithms can significantly improve performance, offering a scalable and effective approach for gene regulatory network discovery.
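
The abstract names entropy sampling and BALD as the basis of its acquisition functions. The sketch below illustrates only the generic versions of these two scores, not the ECES or EBALD definitions from the paper, which the abstract does not spell out. It assumes that each posterior sample yields a discrete predictive distribution over the outcome of a candidate intervention. The function names, the gene labels, and the toy probabilities are all hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def mean_distribution(dists):
    """Average the predictive distributions of all posterior samples."""
    n = len(dists)
    return [sum(d[k] for d in dists) / n for k in range(len(dists[0]))]

def entropy_score(dists):
    """Entropy sampling: total uncertainty of the averaged prediction."""
    return entropy(mean_distribution(dists))

def bald_score(dists):
    """BALD: total uncertainty minus the expected per-sample uncertainty,
    i.e. the mutual information between the outcome and the model."""
    expected = sum(entropy(d) for d in dists) / len(dists)
    return entropy(mean_distribution(dists)) - expected

# Hypothetical toy data: for each candidate intervention target, one
# predictive distribution per posterior sample (two samples here).
candidates = {
    "gene_A": [[0.5, 0.5], [0.5, 0.5]],  # samples agree, each is unsure
    "gene_B": [[0.9, 0.1], [0.2, 0.8]],  # samples are confident but disagree
}

best_entropy = max(candidates, key=lambda g: entropy_score(candidates[g]))
best_bald = max(candidates, key=lambda g: bald_score(candidates[g]))
print("entropy sampling picks:", best_entropy)
print("BALD picks:", best_bald)
```

In this toy setting the two criteria choose different targets: entropy sampling favors the candidate whose averaged prediction is most uncertain, while BALD favors the one on which the posterior samples disagree, since only that disagreement can be reduced by running the experiment.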