CausalBind: Causal Concept Alignment for Protein-Ligand Virtual Screening

Published: 02 Mar 2026, Last Modified: 02 Apr 2026, GEM 2026, CC BY 4.0
Keywords: Virtual Screening, Drug Discovery, Protein-Ligand Binding, Causal Alignment
TL;DR: This paper proposes CausalBind, a plug-in causal concept extraction module for protein-ligand virtual screening that can be integrated into existing contrastive learning frameworks.
Abstract: Drug discovery is a lengthy and costly process, with virtual screening serving as a critical computational step to identify promising drug candidates from vast compound libraries. While contrastive learning has emerged as a powerful paradigm for protein-ligand virtual screening by aligning molecular and protein pocket embeddings, existing methods align entire representations directly, without distinguishing binding-relevant from binding-irrelevant features. This can lead to spurious correlations that limit generalization to novel drug targets. In this work, we propose a plug-in causal concept extraction module that decomposes entangled representations into disentangled atomic concepts using cross-attention and employs learnable sparse masking to identify causally relevant binding features. Experiments show that, in both from-scratch and pretrained settings, our causal alignment consistently improves early enrichment (BEDROC and EF@1%), with the largest gains observed on the more realistic LIT-PCBA benchmark, indicating better prioritization of true binders despite marginal changes in overall AUC.
Submission Number: 79
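
For readers unfamiliar with the early-enrichment metric named in the abstract, the following is a minimal sketch of the standard enrichment factor at a top fraction (EF@1% when `fraction=0.01`): the active rate among the top-ranked compounds divided by the active rate in the whole library. It is not taken from the paper's code. The function name `enrichment_factor`, its parameters, and the rounding rule used for the size of the top fraction are assumptions made for illustration; the paper's evaluation scripts may differ in such details.

```python
import numpy as np


def enrichment_factor(scores, labels, fraction=0.01):
    """Enrichment factor at a top fraction of a ranked library.

    scores: higher means predicted more likely to bind (assumed convention).
    labels: 1 for active (true binder), 0 for decoy or inactive.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    if scores.shape != labels.shape:
        raise ValueError("scores and labels must have the same shape")

    n_total = labels.size
    n_actives = int(labels.sum())
    if n_total == 0 or n_actives == 0:
        raise ValueError("need a non-empty library with at least one active")

    # Size of the top slice. Rounding up (with a small tolerance against
    # floating-point noise) is an assumption; other conventions exist.
    n_top = max(1, int(np.ceil(fraction * n_total - 1e-9)))

    # Rank compounds by descending score; a stable sort keeps ties in input order.
    order = np.argsort(-scores, kind="stable")
    hits_top = int(labels[order[:n_top]].sum())

    # Active rate in the top slice relative to the active rate overall.
    return (hits_top / n_top) / (n_actives / n_total)
```

Under this definition a random ranking gives an expected value of about 1, and the maximum at 1% is 100 when every top-ranked compound is active. Because the metric only looks at the head of the ranking, it can change substantially while AUC, which averages over the full ranking, barely moves.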