Latent Crystallographic Microscope: Probing the Emergent Crystallographic Knowledge in Large Language Models

Published: 30 Sept 2025, Last Modified: 20 Nov 2025, Venue: Mech Interp Workshop (NeurIPS 2025) Poster, License: CC BY 4.0
Open Source Links: https://github.com/JingruG/LCM
Keywords: Applications of interpretability, Other
TL;DR: We introduce the Latent Crystallography Microscope (LCM), a mechanistic interpretability framework for reverse-engineering crystallographic reasoning in large language models.
Abstract: Recent work has begun applying large language models (LLMs) to materials discovery, from property prediction to structure generation. However, the internal mechanisms by which LLMs carry out crystallographic understanding and reasoning tasks remain unexplored. This lack of mechanistic understanding hinders the development of principled approaches to reliable materials discovery. We introduce the Latent Crystallography Microscope (LCM), a mechanistic interpretability framework for reverse-engineering crystallographic reasoning in large language models. We conduct three experiments that trace the progression from mechanistic understanding to controlled intervention. First, format recognition and property extraction tasks reveal that LLMs excel at direct metadata retrieval but struggle with geometric computations, indicating reliance on pattern matching rather than true geometric reasoning. Second, activation patching identifies task-specific neural circuits in which attention heads mediate information routing while MLP blocks encode abstract crystallographic rules, with the computational onset shifting to later layers as task complexity increases. Third, onset-layer interventions during structure generation demonstrate that these mechanistic insights enable targeted neural modifications, though intervention effectiveness remains dependent on the material system. Our analysis localizes crystallographic computations to specific neural circuits, providing intervention targets for future work. This work maps the computational mechanisms underlying crystallographic tasks while demonstrating current limitations in leveraging these insights for reliable materials generation.
Submission Number: 8