Keywords: Out-of-Distribution, OOD, Out-of-Distribution Generalization, Molecular Property Prediction, MPP, Transductive Learning, Multi-Anchor Reasoning, Molecular Encoders, Molecule Representations, Drug Discovery, Materials Science, Machine Learning, Deep Learning, Extrapolation
TL;DR: We propose a novel multi-anchor transductive framework operating in learned latent spaces that improves out-of-distribution generalization for molecular property prediction by leveraging multiple relevant analogues.
Abstract: Predicting molecular properties outside the training data distribution (Out-of-Distribution, OOD) is critical for accelerating drug discovery. This task requires models to extrapolate beyond known property ranges and generalize to novel chemical structures, a common failure point for standard machine learning models. While transductive analogical reasoning shows promise, prior methods are often constrained by fixed descriptors and single-anchor comparisons. To overcome these limitations, we introduce the Multi-Anchor Latent Transduction (MALT) framework, which operates directly within a learned latent space. MALT can leverage embeddings from any pre-trained molecular encoder to select multiple relevant analogues (anchors) of the query molecule. It then integrates the query and anchor embeddings to generate a final prediction. On rigorous OOD benchmarks targeting shifts in both property values and chemical features, MALT consistently improves generalization over standard inductive baselines. Notably, our framework also matches or surpasses the in-distribution performance of these base models. These findings establish multi-anchor transduction in latent space as an effective strategy for augmenting existing molecular encoders, enabling the robust, extrapolative predictions needed for challenging discovery tasks.
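The abstract describes the pipeline only at a high level: select several anchors in latent space, then combine query and anchor embeddings into one prediction. The sketch below is one possible reading of that idea, not the MALT implementation. Everything in it is an assumption made for illustration: cosine similarity for anchor selection, softmax weighting of anchors, the "anchor label plus predicted offset" combination rule, and the hypothetical names `select_anchors`, `multi_anchor_predict`, and `delta_fn`. The embeddings are toy 2-D vectors standing in for the output of a pre-trained encoder.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def select_anchors(query, train_embs, k):
    """Indices of the k training embeddings most similar to the query (assumed rule)."""
    order = sorted(range(len(train_embs)),
                   key=lambda i: cosine(query, train_embs[i]),
                   reverse=True)
    return order[:k]

def multi_anchor_predict(query, train_embs, train_labels, delta_fn, k=3):
    """Each anchor proposes (its known label + predicted offset to the query);
    proposals are averaged with softmax weights over similarity (assumed rule)."""
    idx = select_anchors(query, train_embs, k)
    exps = [math.exp(cosine(query, train_embs[i])) for i in idx]
    total = sum(exps)
    pred = 0.0
    for i, e in zip(idx, exps):
        diff = [q - a for q, a in zip(query, train_embs[i])]
        pred += (e / total) * (train_labels[i] + delta_fn(diff))
    return pred

# Toy data: the property is linear in the embedding, so a perfect offset model exists.
w = [2.0, -1.0]
dot_w = lambda v: sum(a * b for a, b in zip(v, w))
train_embs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
train_labels = [dot_w(e) for e in train_embs]   # labels span [-1, 3]

query = [5.0, 4.0]                              # true property is 6, outside that range
pred = multi_anchor_predict(query, train_embs, train_labels, delta_fn=dot_w, k=3)
print(round(pred, 6))
```

In this toy setting every anchor proposes the same value, so the weighted average recovers a property value beyond the training label range. With a learned, imperfect offset model the proposals would differ, which is where averaging over several anchors rather than one could plausibly help.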
Supplementary Material: zip
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 15016