Leveraging Transfer Learning and Multimodal Foundation Models for Antibiotic Discovery Against Data-Scarce Escherichia coli Strains
Track: Full Paper Track
Keywords: Foundation model, Multimodal Representation Learning, CL-MFAP, D-MPNN, Transfer Learning, Drug Discovery, Antibiotic Discovery, Virtual Screening
Abstract: Antibacterial resistance is a growing global crisis, complicating the treatment of bacterial infections and bacteria-implicated diseases while increasing healthcare costs and mortality. There is therefore a pressing need for novel antibiotics, but traditional drug discovery methods are costly and slow. Artificial intelligence (AI) and deep learning (DL) models can help address these issues, but their benefits are limited for bacterial strains that lack sufficient experimental data for model training. Recently, we developed CL-MFAP, an unsupervised contrastive learning (CL)-based multimodal foundation (MF) model specifically tailored for discovering small molecules with potential antibiotic properties (AP), which has shown great success in antibiotic screening. Deep Message Passing Neural Networks (D-MPNN) are graph neural networks designed for molecular analysis and are also widely used for antibiotic screening. To address experimental data scarcity, we propose a novel pipeline that combines these complementary architectures with transfer learning: both models are first trained on a larger, more general antibacterial dataset, and their learned embeddings are then used to train strain-specific classifiers. This approach enables effective prediction even for bacterial strains with limited data. The pipeline also incorporates extensive virtual screening of almost 11 million commercially available compounds, followed by downstream property prediction to prioritize candidates before experimental validation, which significantly reduces resource requirements. By identifying potential novel antibiotic compounds for Adherent-Invasive *Escherichia coli* LF82 (AIEC LF82), we demonstrate our pipeline's potential for effective antibiotic discovery in data-scarce scenarios.
Attendance: Sugitha Janarthanan, Gen Zhou
Submission Number: 65