Preferential Multi-Objective Bayesian Optimization for Drug Discovery

Published: 06 Mar 2025, Last Modified: 26 Apr 2025
Venue: GEM
Readers: Everyone
License: CC BY 4.0
Track: Machine learning: computational method and/or computational results
Nature Biotechnology: Yes
Keywords: Drug Discovery, Virtual Screening, Molecular Docking, Bayesian Optimization, Preference Learning, Human Feedback, Diffusion Models
TL;DR: Leveraging Bayesian Preferential Learning and diffusion-based docking models to enhance multi-objective virtual screening by incorporating chemists' preferences.
Abstract: Despite decades of advances in automated ligand screening, large-scale docking remains resource-intensive and requires post-processing hit selection, a step in which chemists manually select a few promising molecules based on their chemical intuition. This creates a major bottleneck in the virtual screening process for drug discovery, requiring experts to repeatedly balance complex trade-offs among drug properties across a vast pool of candidates. To improve the efficiency and reliability of this process, we propose CheapVS, a novel human-centered framework that allows chemists to guide ligand selection through pairwise preference feedback. Our framework combines preferential multi-objective Bayesian optimization with an efficient diffusion docking model to capture human chemical intuition and improve hit identification. Specifically, on a library of 100K chemical candidates targeting EGFR, a cancer-associated protein, CheapVS outperforms state-of-the-art docking methods in identifying drugs within a limited computational budget. Notably, our multi-objective algorithm recovers up to 16 of 37 known drugs while screening only 6% of the library, showcasing its potential to advance drug discovery. Code and data for these experiments can be found at https://anonymous.4open.science/r/vs-9A83.
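To make the idea of pairwise preference feedback concrete, the following is a minimal sketch of how a utility over several ligand properties could be learned from "A is preferred over B" comparisons. It is not the CheapVS implementation: it assumes a linear utility fitted with a Bradley-Terry (logistic) likelihood in place of the paper's preferential Bayesian optimization, and it uses synthetic data rather than docking scores. The function name `fit_pairwise_utility`, the property columns, and the simulated chemist weights are all hypothetical.

```python
import numpy as np

def fit_pairwise_utility(features, pairs, steps=500, lr=0.5, l2=1e-3):
    """Fit linear utility weights from (preferred, rejected) index pairs."""
    winners = np.array([w for w, _ in pairs])
    losers = np.array([l for _, l in pairs])
    # Property differences, shape (n_pairs, n_objectives)
    diffs = features[winners] - features[losers]
    weights = np.zeros(features.shape[1])
    for _ in range(steps):
        # Bradley-Terry model: P(winner preferred) = sigmoid(weights . diff)
        probs = 1.0 / (1.0 + np.exp(-(diffs @ weights)))
        # Gradient of the mean log-likelihood with L2 regularization
        grad = diffs.T @ (1.0 - probs) / len(pairs) - l2 * weights
        weights += lr * grad
    return weights

rng = np.random.default_rng(0)

# Synthetic stand-in for ligand properties (assumed columns, e.g.
# binding affinity, solubility, negated toxicity); higher is better.
ligand_props = rng.normal(size=(200, 3))

# Hidden trade-off of a simulated chemist, used only to generate feedback.
hidden_weights = np.array([1.0, 0.5, 0.2])
true_utility = ligand_props @ hidden_weights

# Simulate noise-free pairwise comparisons between random ligands.
idx = rng.integers(0, 200, size=(300, 2))
pairs = [
    (i, j) if true_utility[i] >= true_utility[j] else (j, i)
    for i, j in idx
    if i != j
]

learned_weights = fit_pairwise_utility(ligand_props, pairs)
learned_utility = ligand_props @ learned_weights

# Rank the whole library by the learned utility (best first).
ranking = np.argsort(-learned_utility)
```

In a screening loop, a ranking like this would decide which ligands to evaluate next. A Bayesian treatment would additionally track uncertainty over the utility, which this sketch omits.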
Anonymization: This submission has been anonymized for double-blind review via the removal of identifying information such as names, affiliations, and identifying URLs.
Presenter: ~Sang_T._Truong1
Format: Yes, the presenting author will attend in person if this work is accepted to the workshop.
Funding: Yes, the presenting author of this submission falls under ICLR’s funding aims, and funding would significantly impact their ability to attend the workshop in person.
Submission Number: 5