Confounder-aware foundation modeling for accurate phenotype profiling in cell imaging

Published: 23 Jun 2025, Last Modified: 23 Jun 2025 | Greeks in AI 2025 Poster | CC BY 4.0
Keywords: 1. Vision and Learning, 4. AI for Health, 5. AI for Science, 6. Other (Causal Modeling and Inference, Foundation Modeling, Drug Discovery)
TL;DR: Current AI methods for drug discovery from cell images generalize poorly to new drugs and are sensitive to lab-to-lab variation; we built a confounder-aware foundation model that mitigates both problems, significantly improving predictions of how drugs work and what they target.
Abstract: This paper introduces a confounder-aware foundation model that embeds a causal mechanism within a latent diffusion model to improve phenotype profiling in cell imaging for drug discovery. Cell imaging captures detailed observable characteristics of cells, which are crucial for understanding drug effects. Identifying a drug's mechanism of action (MoA), that is, how it works at the molecular level, and predicting its compound targets, the specific molecules it interacts with, are vital for developing effective therapies. Current AI methods for cell image analysis often fail to generalize to new compounds, which limits the discovery of novel drugs. Furthermore, experimental variability across research centers hinders the accurate determination of MoA and compound targets. To overcome these limitations, we developed a confounder-aware foundation model trained on over 13 million Cell Painting images and 107 thousand compounds. The model learns robust cellular representations that mitigate confounding factors, achieving state-of-the-art performance in MoA and target prediction for both known compounds (0.66 and 0.65 ROC-AUC, respectively) and novel compounds (0.65 and 0.73 ROC-AUC, respectively) and significantly outperforming existing methods. The framework accelerates drug discovery by enabling robust estimation of biological effects for new compounds, which facilitates hit expansion, the identification of more promising drug candidates. It provides a scalable foundation for cell imaging in data-driven drug discovery. This work has been published as a preprint on bioRxiv and is under review at npj Imaging (Nature Portfolio).
Submission Number: 9