Benchmarking LLMs for atomic-level geometric manipulation in crystals

Taoyuze Lv; Alexander Chen; Fengyu Xie; Yingheng Wang; Jeffrey Meng; Bram Hoex; Zhicheng Zhong; Tong Xie

Published: 24 Sept 2025, Last Modified: 26 Dec 2025, Venue: NeurIPS2025-AI4Science Poster, License: CC BY 4.0

Track: Track 1: Original Research/Position/Education/Attention Track

Keywords: Large Language Models, Material Structure, 3D Structure

Abstract: Recent advances in video generators, language-aligned robotics models, and tool-augmented design frameworks suggest that large language models (LLMs) may soon overcome their difficulties with 3D spatial reasoning. To bring these developments to materials science, we present AtomWorld, a data generator and benchmark that evaluates LLMs on atomic-level operations (e.g., inserting, moving, and rotating atoms) in CIF files. We tested the benchmark across major chat models and found that they generally take an algorithmic approach, which succeeds on simple tasks such as adding and moving atoms but struggles on more complex tasks such as rotating around an atom. This weakness in spatial reasoning limits the usefulness of LLMs in crystallography; addressing it is a necessary first step towards enabling higher-level tasks such as recognizing motifs and symmetries, repairing or validating complex structures, and proposing novel structures.
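To make the operations named in the abstract concrete, the following is a minimal sketch of adding, moving, and rotating atoms in Cartesian coordinates. It is an illustration only, not AtomWorld's implementation: the function names, the `(element, (x, y, z))` atom representation, and the choice to rotate an explicit list of atom indices are all assumptions made here. It also ignores the lattice, fractional coordinates, and periodic boundaries that a real CIF workflow would have to handle.

```python
import math

# Hypothetical representation: each atom is (element, (x, y, z)) in Cartesian angstroms.


def add_atom(atoms, element, position):
    """Return a new list with one atom appended."""
    return atoms + [(element, tuple(position))]


def move_atom(atoms, index, displacement):
    """Return a new list with atom `index` translated by `displacement`."""
    element, pos = atoms[index]
    moved = tuple(p + d for p, d in zip(pos, displacement))
    return atoms[:index] + [(element, moved)] + atoms[index + 1:]


def rotate_about_atom(atoms, center_index, axis, angle_deg, indices):
    """Rotate the atoms listed in `indices` about `axis` passing through the center atom."""
    norm = math.sqrt(sum(a * a for a in axis))
    k = tuple(a / norm for a in axis)  # unit rotation axis
    theta = math.radians(angle_deg)
    c, s = math.cos(theta), math.sin(theta)
    center = atoms[center_index][1]
    selected = set(indices)
    result = []
    for i, (element, pos) in enumerate(atoms):
        if i not in selected:
            result.append((element, pos))
            continue
        # Vector from the center atom to this atom.
        v = tuple(p - q for p, q in zip(pos, center))
        # Rodrigues' rotation formula: v*cos + (k x v)*sin + k*(k.v)*(1 - cos).
        cross = (
            k[1] * v[2] - k[2] * v[1],
            k[2] * v[0] - k[0] * v[2],
            k[0] * v[1] - k[1] * v[0],
        )
        dot = sum(a * b for a, b in zip(k, v))
        rotated = tuple(
            v[j] * c + cross[j] * s + k[j] * dot * (1 - c) + center[j]
            for j in range(3)
        )
        result.append((element, rotated))
    return result


if __name__ == "__main__":
    atoms = [("Na", (0.0, 0.0, 0.0)), ("Cl", (1.0, 0.0, 0.0))]
    atoms = add_atom(atoms, "Cl", (0.0, 0.0, 1.0))
    atoms = move_atom(atoms, 2, (0.0, 0.0, 0.5))
    # Rotate the first Cl by 90 degrees about the z axis through the Na atom.
    atoms = rotate_about_atom(atoms, 0, (0.0, 0.0, 1.0), 90.0, [1])
    for element, pos in atoms:
        print(element, tuple(round(x, 6) for x in pos))
```

Adding and moving reduce to a single list update or vector sum, whereas rotating around an atom requires re-centering, applying a rotation, and translating back, which may be one reason it is the harder task for a model that reasons step by step.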

Submission Number: 178
