PMO-Dock: Benchmarking Docking, Specificity and Generalization in Molecular Optimization
Keywords: Benchmark, Drug Discovery, ML, Molecular optimization
Abstract: The Practical Molecular Optimization (PMO) benchmark standardized evaluation in the field, but its objectives are largely limited to simple property-based oracles. Recent methods that evaluate molecules with docking-based objectives move closer to real drug-discovery settings, but unlike PMO, they are evaluated under fragmented protocols that make comparisons inconsistent. We introduce PMO-Dock, a new benchmark and evaluation protocol for docking-based molecular optimization that enforces strict generalization: algorithms must tune hyperparameters on a validation set of diverse protein targets and freeze them for a distinct test set. PMO-Dock contains 23 tasks covering hit generation, lead optimization, and novel “docking with specificity” tasks that require strong on-target binding while penalizing off-target interactions. We benchmark four diverse high-performing methods spanning different optimization paradigms: Saturn (reinforcement learning), GenMol (discrete diffusion), Genetic-guided GFlowNet, and Chemlactica (LLM-based). Our extensive analysis reveals that hyperparameters maximizing performance on validation targets rarely transfer to test tasks, highlighting a critical fragility in current state-of-the-art methods. PMO-Dock supports task-aware hyperparameter selection without test-set overfitting, providing a robust foundation for the next generation of generalizable molecular optimizers.
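The tune-on-validation, freeze-for-test protocol described in the abstract can be illustrated with a minimal sketch. This is not the PMO-Dock implementation: the function names, the `mutation_rate` hyperparameter, the target names, and the toy scoring function are all hypothetical, and selecting by mean validation score is an assumption made for illustration.

```python
from statistics import mean
from typing import Callable, Dict, Sequence, Tuple

Config = Dict[str, float]
RunFn = Callable[[Config, str], float]


def select_and_freeze(
    configs: Sequence[Config],
    validation_targets: Sequence[str],
    test_targets: Sequence[str],
    run_optimizer: RunFn,
) -> Tuple[Config, Dict[str, float]]:
    """Pick one config on validation targets, then reuse it unchanged on test targets."""

    def validation_score(config: Config) -> float:
        # Assumption: configs are ranked by their mean score over validation targets.
        return mean(run_optimizer(config, t) for t in validation_targets)

    frozen = max(configs, key=validation_score)
    # The frozen config is never re-tuned; test targets are only evaluated.
    test_scores = {t: run_optimizer(frozen, t) for t in test_targets}
    return frozen, test_scores


def toy_run(config: Config, target: str) -> float:
    """Hypothetical stand-in for an optimizer run; higher is better."""
    # Validation and test targets deliberately prefer different settings.
    optimum = 0.1 if target.startswith("val") else 0.5
    return -abs(config["mutation_rate"] - optimum)


configs = [{"mutation_rate": 0.1}, {"mutation_rate": 0.5}]
frozen, test_scores = select_and_freeze(
    configs, ["val_a", "val_b"], ["test_a"], toy_run
)
print(frozen, test_scores)
```

In this toy setup the configuration that wins on the validation targets is not the one the test target prefers, which mirrors the transfer failure the abstract reports.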
Submission Number: 107