Hierarchical Molecular Representation Learning via Fragment-Based Self-Supervised Embedding Prediction

Jiele Wu; Haozhe Ma; Zhihan Guo; Thanh Vinh Vo; Tze-Yun Leong

18 Sept 2025 (modified: 11 Feb 2026); Submitted to ICLR 2026; CC BY 4.0

Keywords: Hierarchical molecular representation learning, fragment-based, embedding prediction

TL;DR: Self-supervised molecular graph representation learning through hierarchical, fragment-based embedding prediction

Abstract: Graph self-supervised learning (GSSL) has demonstrated strong potential for generating expressive graph embeddings without human annotations, making it particularly valuable in domains with high labeling costs such as molecular graph analysis. However, existing GSSL methods mostly focus on node- or edge-level information and often ignore chemically relevant substructures, which strongly influence molecular properties. In this work, we propose the Graph Semantic Predictive Network (GraSPNet), a hierarchical architecture that predicts both nodes and semantically meaningful fragments of a graph in embedding space. GraSPNet decomposes molecular graphs into meaningful fragments without relying on a predefined chemical vocabulary and learns graph representations through message-passing graph neural networks. It further captures fragment-level semantics by encoding fragment information and modeling interactions through node-fragment and fragment-fragment message passing. By performing masked prediction of node and fragment features in semantic space, GraSPNet captures structural information at multiple resolutions. Experiments show that GraSPNet is both expressive and generalizable, outperforming existing state-of-the-art methods on multiple molecular property prediction benchmarks in transfer learning settings. The code will be released upon acceptance.
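The idea of masked prediction in embedding space at the fragment level can be illustrated with a toy sketch. This is not GraSPNet's implementation: all function names, the hand-written fragment assignment, and the loss are assumptions made for illustration. In particular, fixed mean aggregation stands in for learned message-passing layers, and the context embeddings are compared to the target embeddings directly rather than through a learned predictor.

```python
# Toy sketch only. Names, the fragment assignment, and the loss are
# illustrative assumptions, not the method described in the paper.

def mean_vectors(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def message_pass(features, edges):
    """One round of mean aggregation on an undirected graph.

    Each node's new feature is the mean over itself and its neighbours.
    This replaces a learned GNN layer with a fixed rule.
    """
    neighbours = {i: [i] for i in range(len(features))}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)
    return [
        mean_vectors([features[j] for j in neighbours[i]])
        for i in range(len(features))
    ]


def pool_fragments(node_embeddings, fragments):
    """Fragment embedding = mean of the embeddings of its member nodes."""
    return [
        mean_vectors([node_embeddings[i] for i in fragment])
        for fragment in fragments
    ]


def masked_embedding_loss(features, edges, fragments, masked_nodes):
    """Mean squared error between fragment embeddings computed from a
    masked (context) view and from the full (target) view of the graph."""
    target = pool_fragments(message_pass(features, edges), fragments)
    dim = len(features[0])
    # Masking here means zeroing the input features of the chosen nodes.
    context_features = [
        [0.0] * dim if i in masked_nodes else f
        for i, f in enumerate(features)
    ]
    context = pool_fragments(message_pass(context_features, edges), fragments)
    squared = [
        (c - t) ** 2
        for c_vec, t_vec in zip(context, target)
        for c, t in zip(c_vec, t_vec)
    ]
    return sum(squared) / len(squared)


# A 4-atom chain split into two hypothetical fragments.
features = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
edges = [(0, 1), (1, 2), (2, 3)]
fragments = [[0, 1], [2, 3]]
loss = masked_embedding_loss(features, edges, fragments, masked_nodes={0})
```

With no nodes masked, the context and target views coincide and the loss is zero; masking any node with non-zero features makes it positive. In a trained model, that gap is what a predictor network would learn to close.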

Primary Area: learning on graphs and other geometries & topologies

Submission Number: 14492
