Latent Crystallographic Microscope: Probing the Emergent Crystallographic Knowledge in Large Language Models

Jingru Gan; Yanqiao Zhu; Wei Wang

Published: 30 Sept 2025, Last Modified: 30 Sept 2025

Venue: Mech Interp Workshop (NeurIPS 2025) Poster

License: CC BY 4.0

Keywords: Applications of interpretability, Probing, Other

TL;DR: This work presents a comprehensive mechanistic understanding of scientific reasoning in LLMs, providing foundations for interpretable AI systems in materials discovery.

Abstract: Large language models (LLMs) have demonstrated capabilities in materials science, generating thermodynamically stable crystal structures without explicit domain training. However, the internal mechanisms enabling this scientific reasoning remain unclear, limiting our ability to develop reliable and controllable AI systems for materials discovery. This work investigates how LLMs encode crystallographic knowledge, how they process multi-structure reasoning, and whether mechanistic insights can enable controlled crystal structure optimization. We introduce the Latent Crystallographic Microscope (LCM), the first mechanistic interpretability framework designed to reverse-engineer crystallographic reasoning in large language models. Through systematic linear probing across Llama 3.1-70B's transformer layers, we identify a hierarchical knowledge architecture in which crystallographic concepts emerge across distinct processing phases: from early chemical composition, through intermediate thermodynamic and geometric reasoning, to final symmetry classification. Our attention flow analysis reveals strong position bias effects in computational resource allocation. We further expose the limitations of prompt-based control approaches through ablation experiments. Moving beyond prompt-level control, we demonstrate that mechanistic insights enable targeted manipulation of crystal structure generation through layer-specific neural interventions, achieving systematic improvements in thermodynamic stability while preserving structural diversity. Overall, we demonstrate that mechanistic interpretability can enable practical control over materials discovery processes, providing critical foundations for interpretable and controllable AI systems that can serve as reliable tools in autonomous materials discovery.
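The layer-wise linear probing mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' LCM code: the activations are synthetic stand-ins for per-layer hidden states, the binary label is a placeholder for a crystallographic property, and every function name and parameter (`make_layer_activations`, `train_linear_probe`, `probe_all_layers`, `signal_by_layer`) is hypothetical. It only shows the general idea that a separate linear classifier is fit on each layer's activations and that held-out accuracy indicates how linearly decodable the property is at that depth.

```python
import math
import random


def make_layer_activations(labels, signal_strength, dim, rng):
    """Synthetic stand-in for one layer's hidden states (hypothetical data)."""
    acts = []
    for y in labels:
        sign = 1.0 if y == 1 else -1.0
        vec = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # The label is written along the first coordinate; the rest is noise.
        vec[0] += sign * signal_strength
        acts.append(vec)
    return acts


def train_linear_probe(X, y, epochs=30, lr=0.1):
    """Fit a logistic-regression probe with plain SGD; returns (weights, bias)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            z = max(-30.0, min(30.0, z))  # clamp so exp() stays well behaved
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - yi  # gradient of the log loss with respect to z
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b


def probe_accuracy(w, b, X, y):
    """Fraction of examples whose label the linear probe predicts correctly."""
    correct = 0
    for xi, yi in zip(X, y):
        pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
        correct += int(pred == yi)
    return correct / len(y)


def probe_all_layers(signal_by_layer, n_train=400, n_test=200, dim=8, seed=0):
    """Train one probe per (synthetic) layer and return held-out accuracies."""
    rng = random.Random(seed)
    y_train = [rng.randint(0, 1) for _ in range(n_train)]
    y_test = [rng.randint(0, 1) for _ in range(n_test)]
    accuracies = []
    for strength in signal_by_layer:
        X_train = make_layer_activations(y_train, strength, dim, rng)
        X_test = make_layer_activations(y_test, strength, dim, rng)
        w, b = train_linear_probe(X_train, y_train)
        accuracies.append(probe_accuracy(w, b, X_test, y_test))
    return accuracies


if __name__ == "__main__":
    # Assumed signal strengths: absent early, weak in the middle, strong late.
    for layer, acc in enumerate(probe_all_layers([0.0, 1.0, 3.0])):
        print(f"layer {layer}: held-out probe accuracy = {acc:.2f}")
```

In a real study, the synthetic vectors would be replaced by hidden states extracted from the model at each layer, and the label by an annotated property such as a symmetry class. A probe that stays near chance at one layer but becomes accurate at a later one suggests where that information becomes linearly accessible.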

Submission Number: 8
