SymForce: Large Language Models as Symbolic Physics Engines for Molecular Conformation

17 Sept 2025 (modified: 12 Feb 2026) | ICLR 2026 Conference Desk Rejected Submission | CC BY 4.0
Keywords: Molecular Conformation Generation, Symbolic Force Generator, Physics-Informed AI, Large Language Model
TL;DR: A framework that employs a large language model as a symbolic force generator; the symbolic forces are then translated into numerical force vectors through a differentiable coordinate update mechanism.
Abstract: Prevailing methods for molecular conformation generation treat 3D structures as static prediction targets, a significant limitation in chemistry. We improve upon this paradigm by reconceptualizing the task as a dynamic process of physical reasoning. Our framework, SymForce, employs a large language model (LLM) as a symbolic physics engine that generates corrective force instructions based on geometric deviations. These symbolic forces then guide an iterative, differentiable optimization to refine the 3D structure. SymForce achieves state-of-the-art performance with a 0.81 Å mean RMSD on GEOM-Drugs. Critically, it exhibits vastly superior generalization to large, out-of-distribution molecules, with performance degrading by only 34.6% compared to 70.6% for a leading diffusion-based method. Ablation studies confirm this symbolic reasoning is essential, as its removal causes a 42.0% performance drop. By integrating an LLM as a symbolic reasoner within a physical simulation loop, SymForce establishes a new paradigm for physics-informed AI and opens new directions in molecular modeling and beyond. The code is available at https://anonymous.4open.science/r/SymForce_code.
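Illustrative sketch (not the authors' implementation): the abstract describes symbolic force instructions being turned into numerical force vectors that drive an iterative coordinate update. The Python below shows one way such a loop could look under strong assumptions. The instruction format (a dict with `i`, `j`, `target`), the harmonic force law, and the names `bond_force` and `refine` are all hypothetical; the abstract does not specify the instruction vocabulary or the update rule. The instructions are hard-coded here rather than produced by an LLM, and plain NumPy is used instead of an autodiff framework.

```python
import numpy as np


def bond_force(coords, i, j, target, k=1.0):
    """Harmonic restoring force that pulls the i-j distance toward `target`.

    Hypothetical force law chosen for illustration only.
    """
    d = coords[j] - coords[i]
    r = np.linalg.norm(d)
    u = d / r                      # unit vector pointing from atom i to atom j
    f = k * (r - target) * u       # if the bond is too long, atom i moves toward j
    forces = np.zeros_like(coords)
    forces[i] += f
    forces[j] -= f                 # equal and opposite, so the centroid is unchanged
    return forces


def refine(coords, instructions, step=0.1, n_iters=200):
    """Repeatedly sum the forces implied by each instruction and step along them."""
    coords = coords.copy()
    for _ in range(n_iters):
        total = np.zeros_like(coords)
        for ins in instructions:
            total += bond_force(coords, ins["i"], ins["j"], ins["target"])
        coords += step * total
    return coords


# Two atoms placed 2.0 apart; a hypothetical instruction asks for a 1.5 separation.
start = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
instructions = [{"i": 0, "j": 1, "target": 1.5}]
refined = refine(start, instructions)
final_distance = np.linalg.norm(refined[1] - refined[0])
```

With this step size the distance error shrinks by a constant factor each iteration, so `final_distance` settles at the requested target. A real system would combine many instruction types (angles, torsions, clashes) and would need a step size chosen for stability.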
Supplementary Material: zip
Primary Area: foundation or frontier models, including LLMs
Submission Number: 9434