MotifScreen: Generalizing Virtual Screening through Learning Protein-Ligand Interaction Principles

ICLR 2026 Conference Submission5508 Authors

15 Sept 2025 (modified: 20 Nov 2025) | ICLR 2026 Conference Submission | Readers: Everyone | CC BY 4.0
Keywords: Virtual Screening, Protein-Ligand Interaction Prediction, Multitask Learning, Structure-Based Drug Design, Benchmark, Graph Neural Networks
TL;DR: We propose MotifScreen, a structure-based virtual screening (VS) method that overcomes limitations of many deep learning VS models through principle-guided multitask learning, along with a new benchmark set designed to test true generalization.
Abstract: Virtual screening methods continue to face a fundamental trade-off between accuracy and efficiency. Deep learning-based methods that attempt to address this challenge suffer from overfitting caused by sparse, biased training data and inadequate validation practices. We first show that the over-optimism of prevalent deep learning-based methods stems from incorrect validation setups, and that their actual performance approaches that of random selection. We then present MotifScreen, a structure-based, end-to-end virtual screening method that addresses these limitations through principle-guided multitask learning. We ask our network to rationalize its predictions by learning the principles of protein-ligand interactions step by step: 1) receptor pocket analyses, 2) ligand-pocket chemical compatibility, and 3) ligand binding probability given that compatibility. This multitask framework, trained on a new dataset curated specifically for the task, significantly outperforms existing methods and classification-only baselines when evaluated on a stand-alone test set.
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 5508