Keywords: Virtual screening, Drug discovery
Abstract: Virtual screening (VS) is a critical component of modern drug discovery, yet most existing methods—whether physics-based or deep learning-based—are developed around *holo* protein structures with known ligand-bound pockets. Consequently, their performance degrades significantly on *apo* or predicted structures such as those from AlphaFold2, which are more representative of real-world early-stage drug discovery, where pocket information is often missing. In this paper, we introduce an alignment-and-aggregation framework that enables accurate virtual screening under structural uncertainty. Our method comprises two core components: (1) a tri-modal contrastive learning module that aligns representations of the ligand, the *holo* pocket, and cavities detected on the protein structure, thereby enhancing robustness to pocket localization errors; and (2) a cross-attention-based adapter that dynamically aggregates candidate binding sites, enabling the model to learn from activity data even without precise pocket annotations. We evaluated our method on a newly curated benchmark of *apo* structures, where it significantly outperforms state-of-the-art methods in the blind *apo* setting, improving the early enrichment factor (EF1\%) from 11.75 to 37.19. Notably, it also maintains strong performance on *holo* structures. These results demonstrate the promise of our approach in advancing first-in-class drug discovery, particularly in scenarios lacking experimentally resolved protein-ligand complexes. Our implementation is publicly available at [https://github.com/Wiley-Z/AANet](https://github.com/Wiley-Z/AANet).
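To make the two components named in the abstract concrete, below is a minimal, illustrative sketch (not the authors' implementation) of (1) a tri-modal InfoNCE-style contrastive loss aligning ligand, *holo*-pocket, and detected-cavity embeddings, and (2) a cross-attention adapter in which the ligand embedding queries a set of candidate cavities. All module names, dimensions, and the choice of InfoNCE are assumptions for illustration only.

```python
# Hypothetical sketch of the two ideas described in the abstract; not the AANet code.
import torch
import torch.nn as nn
import torch.nn.functional as F


def info_nce(a: torch.Tensor, b: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE between two batches of embeddings (positives on the diagonal)."""
    a, b = F.normalize(a, dim=-1), F.normalize(b, dim=-1)
    logits = a @ b.t() / temperature                      # (B, B) similarity matrix
    targets = torch.arange(a.size(0), device=a.device)
    return 0.5 * (F.cross_entropy(logits, targets) + F.cross_entropy(logits.t(), targets))


def tri_modal_loss(z_ligand, z_pocket, z_cavity):
    """Align ligand / holo-pocket / detected-cavity embeddings pairwise."""
    return (info_nce(z_ligand, z_pocket)
            + info_nce(z_ligand, z_cavity)
            + info_nce(z_pocket, z_cavity)) / 3.0


class CavityAggregator(nn.Module):
    """Cross-attention adapter: the ligand embedding attends over candidate binding sites."""

    def __init__(self, dim: int = 256, num_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, z_ligand, z_cavities, cavity_mask=None):
        # z_ligand: (B, D), z_cavities: (B, K, D) candidate cavities,
        # cavity_mask: optional (B, K) boolean, True for padded candidates.
        query = z_ligand.unsqueeze(1)                      # (B, 1, D)
        pooled, _ = self.attn(query, z_cavities, z_cavities,
                              key_padding_mask=cavity_mask)
        return pooled.squeeze(1)                           # (B, D) aggregated site embedding


if __name__ == "__main__":
    B, K, D = 8, 5, 256
    z_lig, z_poc, z_cav = (torch.randn(B, D) for _ in range(3))
    print("contrastive loss:", tri_modal_loss(z_lig, z_poc, z_cav).item())
    agg = CavityAggregator(D)
    print("aggregated shape:", agg(z_lig, torch.randn(B, K, D)).shape)
```

The aggregated site embedding could then be scored against the ligand embedding for screening, which is one plausible reading of how activity data can supervise the model without precise pocket annotations.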
Primary Area: Machine learning for sciences (e.g. climate, health, life sciences, physics, social sciences)
Submission Number: 26314