Abstract: Large language models (LLMs) have recently shown significant progress, approaching human-level perception. In this work, we demonstrate that, despite these advances, LLMs still struggle to reason using molecular structural information. This gap is critical because many molecular properties, including functional groups, depend heavily on such structural details. To address this limitation, we propose an approach that sketches molecular structures for reasoning. Specifically, we introduce the Molecular Structural Reasoning (MSR) framework, which enhances LLMs' molecular understanding by explicitly incorporating key structural features. We present two frameworks, one for scenarios where the target molecule is known and one for scenarios where it is unknown. Extensive experiments verify that MSR improves molecular understanding.
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: healthcare applications, fine-tuning, molecule
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data analysis
Languages Studied: English, SMILES
Submission Number: 2845