Deep learning virtual screening with active signature learning improves the identification of small-molecule modulators of complex phenotypes

Published: 17 Jun 2024, Last Modified: 17 Jul 2024, Venue: ICML2024-AI4Science Spotlight, License: CC BY 4.0
Keywords: drug discovery, computational biology, single-cell, transcriptomics, genomics
TL;DR: We developed a generalizable closed-loop AI framework that significantly improves the efficiency of phenotypic drug discovery by predicting compound effects on cells from their molecular profiles.
Abstract: Phenotypic drug discovery holds promise for developing new medicines but is limited by throughput and scalability. Current applications of AI to improve screening efficiency rely on single-use models, each trained on a phenotype-specific high-throughput screen. We introduce a generalizable deep learning framework that leverages omics data to prioritize compounds for virtually any phenotype using a single model. We also developed a novel closed-loop active signature learning procedure to optimize the omics signature associated with a target phenotype. We trained our model on over 425,000 perturbation signatures and validated it using a new single-cell transcriptomics benchmark dataset profiling 88 perturbations across 10 cell lines. Our approach outperformed published methods by 15-80% and led to a 16-19X increase in productivity in two hematology phenotypic discovery campaigns, providing the first experimental validation that deep learning and omics data can improve the productivity of phenotypic discovery in a real-world setting. We next demonstrated that our active signature learning algorithm can refine hit compound prioritization and yield mechanistic insights through an integrative lab-in-the-loop framework. This approach enables rational drug design targeting complex phenotypes, ushering in a new era of drug discovery.
Submission Number: 80