Benchmarking Multimodal Large Language Models on Electronic Structure Analysis and Interpretation

Published: 20 Sept 2025, Last Modified: 05 Nov 2025 · AI4Mat-NeurIPS-2025 Poster · CC BY 4.0
Keywords: Multimodal, Large language model, Electronic structure, Density of states
Abstract: Large language models (LLMs) are increasingly adopted in materials science, enabling automated literature mining, domain-specific scientific reasoning, and autonomous materials design. However, most existing systems remain limited to single-modality inputs, preventing them from exploiting the rich multimodal information inherent in the field. The electronic structure of materials is essential for predicting material properties, understanding their origins, and guiding new materials design, yet its integration into multimodal LLM (MLLM) frameworks remains largely unexplored. Here, we present the first systematic benchmark of pre-trained MLLMs for density of states (DOS) interpretation. Using a high-fidelity dataset derived from first-principles calculations, we evaluate MLLMs on visual question answering and captioning tasks related to the interpretation of electronic structures, with captions scored by both human experts and MLLM-based evaluators. Our results reveal the capabilities and limitations of MLLMs in electronic structure analysis and provide a foundation for developing next-generation multimodal AI systems for materials design.
Submission Track: Benchmarking in AI for Materials Design - Short Paper
Submission Category: AI-Guided Design
Institution Location: Tokyo, Japan
AI4Mat Journal Track: Yes
Submission Number: 17