Keywords: Molecule; Reasoning; Large language model
Abstract: Large Language Models (LLMs) have demonstrated remarkable performance
across various domains, yet their capabilities in molecular reasoning remain
insufficiently explored. Current approaches tend to rely heavily on general-purpose
prompting, which lacks domain-specific molecular semantics, while those that
use fine-tuning strategies often face challenges with interpretability and reasoning
depth. To address these issues, we introduce MolReasoner, a two-stage framework
designed to transition LLMs from memorization towards chemical reasoning.
First, we propose Mol-SFT, which initializes the model's reasoning abilities
by distilling high-quality Chain-of-Thought (CoT) reasoning trajectories from
GPT-4o. These trajectories are enriched with structural features and functional
group information and verified for chemical accuracy, enabling the model to
internalize coherent and chemically meaningful reasoning. Second, in the Mol-RL
stage, we propose verifiable and
extensible multi-level rewards, where language- and structural-similarity rewards
provide fine-grained semantic and structural alignment. Moreover, we introduce
more comprehensive metrics, together with a multi-dimensional expert-aligned
pipeline to rigorously assess reasoning quality. Extensive experiments demonstrate
that MolReasoner outperforms existing methods, marking a significant
shift from memorization-based outputs to robust chemical reasoning. The code for
MolReasoner is included in the supplementary materials and will be open-sourced
in the near future.
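As a rough illustration of how language- and structural-similarity rewards of the kind described for Mol-RL might be computed, the sketch below combines a token-overlap F1 score between two captions with a Tanimoto similarity between two molecules. Everything in it is an assumption made for illustration: the abstract does not specify the reward formulas, the features, or the weights. In particular, the character n-grams of a SMILES string are only a crude stand-in for a real molecular fingerprint, and the function names and the equal weighting are hypothetical.

```python
from collections import Counter


def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two feature sets."""
    if not a and not b:
        return 1.0  # two empty feature sets are treated as identical
    return len(a & b) / len(a | b)


def smiles_ngrams(smiles: str, n: int = 2) -> set:
    """Hypothetical structural features: character n-grams of a SMILES string.

    This is a stand-in for a proper molecular fingerprint, used here only
    so that the sketch runs without a cheminformatics library.
    """
    if len(smiles) < n:
        return {smiles} if smiles else set()
    return {smiles[i:i + n] for i in range(len(smiles) - n + 1)}


def token_f1(pred: str, ref: str) -> float:
    """Unigram-overlap F1 between a predicted and a reference caption."""
    p = Counter(pred.lower().split())
    r = Counter(ref.lower().split())
    overlap = sum((p & r).values())  # multiset intersection size
    if overlap == 0:
        return 0.0
    precision = overlap / sum(p.values())
    recall = overlap / sum(r.values())
    return 2 * precision * recall / (precision + recall)


def combined_reward(pred_smiles: str, ref_smiles: str,
                    pred_text: str, ref_text: str,
                    w_struct: float = 0.5, w_lang: float = 0.5) -> float:
    """Weighted sum of a structural and a language similarity score.

    The weights are assumed values, not taken from the paper.
    """
    struct = tanimoto(smiles_ngrams(pred_smiles), smiles_ngrams(ref_smiles))
    lang = token_f1(pred_text, ref_text)
    return w_struct * struct + w_lang * lang
```

Both component scores lie in [0, 1], so with weights that sum to 1 the combined reward stays in the same range and gives partial credit for near-miss outputs rather than an all-or-nothing signal.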
Supplementary Material: zip
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 6596