Spectral Manifold Harmonization for Graph Imbalanced Regression

Published: 13 Jul 2025, Last Modified: 13 Jul 2025, DIG-BUG Short, CC BY 4.0
Keywords: Machine Learning, Imbalance Learning, Molecular Property Prediction
TL;DR: We overcome the limitations of existing oversampling techniques in regression settings by generating realistic synthetic examples that address the imbalance problem without distorting the underlying graph topology.
Abstract: Graph-structured data is ubiquitous in scientific domains, where models often face imbalanced learning settings. In imbalanced regression, domain preferences focus on specific target value ranges that represent the most scientifically valuable cases, yet this setting remains significantly under-researched. In this paper, we present Spectral Manifold Harmonization (SMH), a novel approach to imbalanced regression on graph-structured data that generates synthetic graph samples which preserve topological properties while focusing on the often underrepresented regions of the target distribution. Conventional methods fail in this context because they either ignore graph topology when generating cases or do not target specific domain ranges, resulting in models biased toward average target values. Experimental results on chemistry and drug discovery benchmark datasets demonstrate the potential of SMH, showing consistent improvements in predictive performance for target domain ranges.
Submission Number: 44