Enhancing molecular representation via fusion of multimodal transformers with integrated periodic local and global features

Jia Ao, Xiangsheng Huang, Wei Dai, Cancan Ji

Published: 2025, Last Modified: 28 Feb 2026, J. Comput. Aided Mol. Des. 2025, CC BY-SA 4.0

Abstract: Molecules are structurally complex, so learning molecular representations requires a large amount of molecular data. Labeled data, however, is typically limited, which makes self-supervised pretraining methods essential. Current pretraining methods often fail to attend sufficiently to both local and global molecular information. In this study, we propose MoleTGL, a multimodal self-supervised learning framework that captures local and global information simultaneously. Specifically, we encode SMILES sequences and molecular graphs separately and use a unified fusion approach to strengthen the interaction between the two modalities. In the molecular graph encoding, we capture global and local information independently and strengthen the attention paid to bond features through information fusion. We also introduce the FA-FFN module to aggregate periodic features of the molecule. Experimental results show that MoleTGL outperforms existing methods on seven classification tasks and six regression tasks in molecular property prediction. Ablation studies confirm the effectiveness of fusing local and global features and the advantage of the methods used to acquire local and global information.
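
To make the distinction between local and global graph information concrete, here is a minimal sketch in plain Python. It is not the MoleTGL architecture: the neighbor-averaging step, the mean-pool readout, the concatenation-based fusion, the function names, and the two-dimensional atom features are all assumptions chosen for illustration. The abstract does not specify how the paper computes or fuses these features.

```python
# Illustrative sketch only: toy stand-ins, not the method from the paper.

def local_features(atom_feats, bonds):
    """One round of neighbor averaging: each atom mixes in its bonded neighbors."""
    neighbors = {i: [] for i in range(len(atom_feats))}
    for a, b in bonds:
        neighbors[a].append(b)
        neighbors[b].append(a)
    out = []
    for i, feat in enumerate(atom_feats):
        group = [feat] + [atom_feats[j] for j in neighbors[i]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out

def global_feature(atom_feats):
    """Mean-pool over all atoms to obtain one molecule-level vector."""
    return [sum(col) / len(atom_feats) for col in zip(*atom_feats)]

def fuse(local, global_vec):
    """Assumed fusion: concatenate each atom's local vector with the global vector."""
    return [vec + global_vec for vec in local]

# Ethanol (SMILES "CCO"): atoms 0 (C), 1 (C), 2 (O); bonds 0-1 and 1-2.
# Hypothetical atom features: [is_carbon, is_oxygen].
atom_feats = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
bonds = [(0, 1), (1, 2)]

local = local_features(atom_feats, bonds)
global_vec = global_feature(atom_feats)
fused = fuse(local, global_vec)
```

In this toy example the terminal carbon sees only its carbon neighbor, so its local vector stays `[1.0, 0.0]`, while the global vector reflects the whole molecule, including the oxygen. Each fused vector therefore carries both views of the same atom.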

External IDs: dblp:journals/jcamd/AoHDJ25