A Transformer-based Multi-modal Joint Attention Fusion Model for Molecular Property Prediction

Ke Wang, Wei Zhang, Yong Liu

Published: 01 Jan 2023, Last Modified: 08 Feb 2025, Venue: BIBM 2023, License: CC BY-SA 4.0

Abstract: Molecular property prediction plays a crucial role in drug screening and discovery. Its central task is to obtain effective embeddings of molecular structures. Molecules are commonly described as textual sequences or as graphs, and each single-modal representation loses some information. Previous efforts have attempted to combine these modalities to address this information loss across diverse tasks. Integrating chemical information from different modalities should therefore be considered for more accurate representations. Given the advantages of Transformers in various fields of artificial intelligence, leveraging the attention mechanism to integrate molecular sequence and graph representations is desirable for achieving improved molecular embeddings. To this end, we propose a deep learning model called MJAF and design a novel information fusion strategy based on joint attention mechanisms. This approach harnesses the strengths of both molecular representation modalities and significantly enhances the efficiency of molecular embedding. We conducted multiple experiments comparing our model with state-of-the-art methods. Experimental results on 4 independent datasets demonstrate that the proposed model achieves significant improvements.
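
Illustrative sketch (not from the paper): the abstract does not specify how MJAF's joint attention is built, so the following only shows the general idea of attention-based fusion of two modalities. It is a minimal example under stated assumptions: the function names (`cross_attend`, `fuse`), the mean pooling, the absence of learned projection matrices, and the toy input vectors are all hypothetical choices made for illustration, not the authors' architecture.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attend(queries, keys_values):
    """Scaled dot-product attention: each query attends over keys_values,
    which serve as both keys and values (no learned projections here)."""
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, keys_values))
                    for j in range(d)])
    return out

def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]

def fuse(seq_tokens, graph_nodes):
    """Hypothetical fusion: sequence tokens attend to graph nodes and vice
    versa; the two pooled results are concatenated into one embedding."""
    seq_ctx = cross_attend(seq_tokens, graph_nodes)
    graph_ctx = cross_attend(graph_nodes, seq_tokens)
    return mean_pool(seq_ctx) + mean_pool(graph_ctx)

# Toy inputs (made up): 3 sequence-token vectors and 2 graph-node vectors.
seq_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
graph_nodes = [[0.5, 0.5], [1.0, -1.0]]
embedding = fuse(seq_tokens, graph_nodes)  # length-4 fused vector
```

In a real model the token and node vectors would come from learned sequence and graph encoders, and the attention would use learned query, key, and value projections with multiple heads; those parts are omitted here to keep the fusion step visible.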