Scaling Pocket Docking with Data Augmentation and Heterogeneous Equivariant Graph Attention

Published: 28 May 2026, Last Modified: 28 May 2026
Venue: GenBio 2026 Poster
License: CC BY 4.0
Keywords: All-Atom Modeling, Synthetic Data Augmentation, Equivariant Graph Attention, Molecular Docking, Pose Ranking
TL;DR: We achieve state-of-the-art all-atom docking by combining large-scale data augmentation with a fast Heterogeneous Equivariant Graph Attention (HeteroEGA) ranking pipeline.
Abstract: Accurate pocket-level molecular docking is limited by narrow training distributions, costly all-atom modeling, and ranking functions that often fail to select the best pose from high-quality generated ensembles. We introduce a scalable docking pipeline that combines large-scale data augmentation, efficient all-atom equivariant modeling, and improved pose ranking. Our diffusion model is trained on curated PLINDER complexes augmented with SAIR synthetic structures, using leakage removal, structural quality filtering, and dataset-specific pocket cutoffs to improve generalization. For ranking, we replace fully connected tensor-product convolutions in the confidence model with Heterogeneous Equivariant Graph Attention (HeteroEGA), enabling interaction-specific attention across ligand, receptor-residue, and receptor-atom graphs. We also evaluate an independent physics-based refinement track using Vina minimization followed by GNINA reranking. On PoseBusters-308, our confidence-ranking pipeline achieves 81.85% Top-1 success rate and 94.51% Oracle success rate, surpassing SigmaDock and DiffDock-RL++. The proposed HeteroEGA confidence model slightly outperforms the post-refinement track while ranking poses substantially faster. These results show that combining broader training data with attention-based equivariant ranking can close much of the gap between Top-1 and Oracle docking accuracy.
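The gap between Top-1 and Oracle success that the abstract refers to can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes that a pose counts as a success when its ligand RMSD to the reference is at most 2.0 Å (a common docking convention that the abstract does not state), that higher confidence means a better-ranked pose, and that any additional validity checks are ignored. The function name `success_rates` and the toy numbers are hypothetical.

```python
from typing import Sequence, Tuple


def success_rates(
    rmsds: Sequence[Sequence[float]],
    confidences: Sequence[Sequence[float]],
    threshold: float = 2.0,  # assumed success cutoff in angstroms, not taken from the abstract
) -> Tuple[float, float]:
    """Return (top1_rate, oracle_rate) over a set of complexes.

    rmsds[i][j] is the RMSD of sampled pose j for complex i.
    confidences[i][j] is the ranking score of that pose (higher is better).
    """
    if not rmsds or len(rmsds) != len(confidences):
        raise ValueError("need one non-empty, matching entry per complex")

    top1_hits = 0
    oracle_hits = 0
    for pose_rmsds, pose_scores in zip(rmsds, confidences):
        # Top-1: judge only the pose the ranking model scores highest.
        best_ranked = max(range(len(pose_scores)), key=pose_scores.__getitem__)
        top1_hits += pose_rmsds[best_ranked] <= threshold
        # Oracle: judge the best pose in the ensemble, as if ranking were perfect.
        oracle_hits += min(pose_rmsds) <= threshold

    n = len(rmsds)
    return top1_hits / n, oracle_hits / n


# Toy data for four complexes with two sampled poses each (made-up values).
toy_rmsds = [[1.2, 3.5], [4.0, 1.8], [5.0, 6.0], [0.9, 2.5]]
toy_confidences = [[0.9, 0.1], [0.8, 0.3], [0.5, 0.4], [0.2, 0.7]]
top1, oracle = success_rates(toy_rmsds, toy_confidences)
print(f"Top-1: {top1:.2%}, Oracle: {oracle:.2%}")
```

In this toy case the second and fourth complexes each contain a pose under the cutoff that the ranker does not select, so the Oracle rate exceeds the Top-1 rate. A better ranking model narrows exactly this difference, since the Oracle rate is fixed by the sampled ensemble.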
Email Sharing: We authorize the sharing of all author emails with Program Chairs.
Data Release: We authorize the release of our submission and author names to the public in the event of acceptance.
Submission Number: 149