Keywords: imbalanced regression; distribution-guided routing; small-data regression; gating supervision loss; contrastive representation learning
TL;DR: DistRouting improves molecular property prediction on imbalanced targets by routing samples to specialized experts based on molecular features and by using interval-aware contrastive learning to improve the learned representations.
Abstract: Molecular property regression often suffers from target distribution imbalance, where standard models tend to overfit to dense target regions and underperform on rare but critical ones. This limitation is particularly problematic in virtual screening, where compounds with rare property values are often of special interest. To address this challenge, we propose DistRouting, a novel distribution-aware expert routing module designed to improve model robustness under imbalanced regression settings. DistRouting partitions the target space into intervals and assigns each expert to specialize in a specific target range. Expert assignment is driven by a hybrid routing mechanism that leverages both molecular embeddings and physicochemical descriptors. To further encourage distribution-aligned representation learning, we introduce an interval-aware supervised contrastive loss that brings together samples from the same target interval and pushes apart those from different ones. Extensive experiments on multiple molecular property benchmarks show that models equipped with DistRouting consistently outperform their vanilla counterparts, especially in rare target regions. Moreover, DistRouting leads to predicted distributions that better align with the true target distributions. These findings demonstrate the effectiveness of DistRouting as a plug-in module for addressing the challenge of imbalanced molecular property regression.
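Illustrative sketch (not the authors' implementation): the abstract describes partitioning the target space into intervals and applying a supervised contrastive loss that treats samples from the same interval as positives. The NumPy code below shows one way this could look. The equal-width binning, the temperature value, and the function names `assign_intervals` and `interval_supcon_loss` are assumptions made for illustration; the abstract does not specify how intervals are chosen or how the loss is parameterized.

```python
import numpy as np

def assign_intervals(y, n_intervals=3):
    """Map each target value to an interval index (equal-width bins, an assumption)."""
    edges = np.linspace(y.min(), y.max(), n_intervals + 1)
    # Use interior edges only so indices fall in 0..n_intervals-1.
    return np.digitize(y, edges[1:-1])

def interval_supcon_loss(z, intervals, temperature=0.1):
    """Supervised contrastive loss using interval indices as labels."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # unit-normalize embeddings
    sim = z @ z.T / temperature                        # scaled cosine similarities
    not_self = ~np.eye(len(z), dtype=bool)             # exclude each anchor itself
    sim = sim - sim.max(axis=1, keepdims=True)         # numerical stability
    exp_sim = np.exp(sim) * not_self
    log_prob = sim - np.log(exp_sim.sum(axis=1, keepdims=True))
    # Positives: other samples whose target lies in the same interval.
    positives = (intervals[:, None] == intervals[None, :]) & not_self
    n_pos = positives.sum(axis=1)
    has_pos = n_pos > 0                                # skip anchors with no positive
    if not has_pos.any():
        return 0.0
    per_anchor = -(log_prob * positives).sum(axis=1)[has_pos] / n_pos[has_pos]
    return float(per_anchor.mean())

# Toy data: six targets, with embeddings clustered by target interval.
y = np.array([0.0, 0.1, 0.2, 5.0, 5.1, 10.0])
z = np.array([[1.0, 0.0], [1.0, 0.01], [1.0, -0.01],
              [0.0, 1.0], [0.01, 1.0], [-1.0, 0.0]])
intervals = assign_intervals(y, n_intervals=3)
loss_aligned = interval_supcon_loss(z, intervals)
# Same embeddings with interval labels that ignore the cluster structure.
loss_mismatched = interval_supcon_loss(z, np.array([0, 1, 2, 0, 1, 2]))
```

In this toy setup the loss is lower when embeddings from the same interval sit close together, which is the behavior the interval-aware objective is meant to encourage. In an imbalanced setting, equal-width bins would leave rare intervals with few positives, so a real implementation may choose the interval boundaries differently.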
Supplementary Material: zip
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 13275