Broadening Discovery through Structural Models: Multimodal Combination of Local and Structural Properties for Predicting Chemical Features.
Keywords: ECFP, graphs, LLMs, transformers
TL;DR: This study examines the efficacy of fingerprint-based language models within a bimodal architecture that combines graph-based and language model components to improve molecular representation in cheminformatics.
Abstract: Machine learning (ML) has had a significant impact on chemistry in recent years, enabling advances in applications such as molecular property prediction and molecular structure generation. String representations such as the Simplified Molecular Input Line Entry System (SMILES) are widely adopted, but they are limited in how well they convey the essential physical and chemical properties of compounds. Vector representations, particularly chemical fingerprints, have by contrast proven effective across a range of ML tasks. Graph-based models, which exploit the inherent structure of chemical compounds, have also shown promise in improving predictive accuracy. This study investigates fingerprint-based language models within a bimodal architecture that combines graph-based and language model components. We propose a method that integrates these approaches, significantly improving predictive performance over conventional methods while capturing more accurate chemical information.
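To make the two input views concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say how fingerprints are tokenized or how the two modalities are fused, so everything here is an assumption for illustration: `circular_fingerprint` is a toy ECFP-like hash of atom neighbourhoods (real ECFP uses richer atom invariants and bond orders), and `bimodal_inputs`, the `fp_<bit>` token format, and the `n_bits` default are hypothetical names and choices.

```python
import hashlib


def _stable_hash(text: str, n_bits: int) -> int:
    """Deterministic hash into [0, n_bits); built-in hash() is salted per process."""
    return int(hashlib.sha256(text.encode()).hexdigest(), 16) % n_bits


def circular_fingerprint(atoms, bonds, radius=2, n_bits=64):
    """Toy ECFP-like fingerprint (assumption: element symbols are the only atom feature)."""
    neighbours = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)

    labels = list(atoms)  # radius-0 identifiers: just the element symbol
    on_bits = {_stable_hash(lbl, n_bits) for lbl in labels}
    for _ in range(radius):
        # New identifier = own label plus the sorted labels of bonded atoms,
        # so each round widens the local neighbourhood by one bond.
        labels = [
            labels[i] + "(" + ",".join(sorted(labels[j] for j in neighbours[i])) + ")"
            for i in range(len(atoms))
        ]
        on_bits |= {_stable_hash(lbl, n_bits) for lbl in labels}
    return sorted(on_bits)


def bimodal_inputs(atoms, bonds, radius=2, n_bits=64):
    """Hypothetical helper: one molecule as language-model tokens plus a graph."""
    tokens = [f"fp_{bit}" for bit in circular_fingerprint(atoms, bonds, radius, n_bits)]
    graph = {"nodes": list(atoms), "edges": [tuple(e) for e in bonds]}
    return {"tokens": tokens, "graph": graph}


# Ethanol heavy atoms (C-C-O); hydrogens omitted for brevity.
ethanol = bimodal_inputs(["C", "C", "O"], [(0, 1), (1, 2)])
```

In this sketch the token list would feed a language model and the node/edge structure a graph network. The fingerprint depends only on connectivity, not on the order in which atoms are listed, which is the property that makes it a local structural descriptor.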
Primary Area: unsupervised, self-supervised, semi-supervised, and supervised representation learning
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Submission Number: 1539