L^2M^3OF: A Large Language Multimodal Model for Metal-Organic Frameworks

15 Sept 2025 (modified: 11 Feb 2026) · Submitted to ICLR 2026 · CC BY 4.0
Keywords: Crystal Material, Foundation Models for Science
Abstract: Large language models (LLMs) have demonstrated remarkable reasoning capabilities across diverse natural language tasks. Comparable breakthroughs in scientific discovery have been more limited, because understanding complex physical phenomena demands multifaceted representations that go well beyond language alone. A compelling example is the design of functional materials such as metal-organic frameworks (MOFs), which are critical for impactful applications like carbon capture and hydrogen storage. Navigating their vast and intricate design space with language-based representations that LLMs can interpret is challenging, owing to the numerous possible three-dimensional atomic arrangements and the strict reticular rules of coordination geometry and topology. Despite promising early results in LLM-assisted discovery for simpler materials systems, MOF design remains heavily reliant on tacit human expertise that is rarely codified in text alone. To overcome this barrier, we introduce L^2M^3OF, the first multimodal LLM for MOFs. L^2M^3OF integrates crystal representation learning with language understanding to process structural, textual, and knowledge modalities jointly. It employs a pre-trained crystal encoder with a lightweight projection layer that compresses structural information into a token space, enabling efficient alignment with language instructions. To facilitate training and evaluation, we curate a structure–property–knowledge database of crystalline materials and benchmark L^2M^3OF against state-of-the-art (SOTA) closed-source LLMs such as GPT-5, Gemini-2.5-Pro, and DeepSeek-R1. Experiments show that L^2M^3OF outperforms leading text-based closed-source LLMs in property prediction and knowledge generation tasks, despite using far fewer parameters. These results highlight the importance of multimodal approaches for understanding porous crystalline materials and establish L^2M^3OF as a foundation for next-generation AI systems in materials discovery.
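The encoder-plus-projection idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the form of the projection, how many structure tokens are produced, or how they are combined with the instruction, so everything here is an assumption: a single linear map, the hypothetical sizes `D_ENC`, `NUM_TOKENS`, and `D_MODEL`, the hypothetical helpers `project_to_tokens` and `build_input_sequence`, and random numbers standing in for the encoder output and learned weights.

```python
import random


def project_to_tokens(embedding, weight, bias, num_tokens, d_model):
    """Linearly project one crystal embedding into num_tokens vectors of size d_model.

    Assumption: the projection is a single affine map; the paper may use another form.
    """
    out_dim = num_tokens * d_model
    assert len(weight) == out_dim and len(bias) == out_dim
    # Affine map: one output value per weight row.
    flat = [
        sum(w * x for w, x in zip(row, embedding)) + b
        for row, b in zip(weight, bias)
    ]
    # Split the flat output into num_tokens consecutive chunks of length d_model.
    return [flat[i * d_model:(i + 1) * d_model] for i in range(num_tokens)]


def build_input_sequence(structure_tokens, text_token_embeddings):
    """Prepend structure tokens to the embedded instruction (an assumed ordering)."""
    return structure_tokens + text_token_embeddings


# Hypothetical sizes, chosen only to keep the example small.
D_ENC, NUM_TOKENS, D_MODEL = 8, 2, 4

rng = random.Random(0)
# Stand-in for the output of a pre-trained crystal encoder.
crystal_embedding = [rng.uniform(-1.0, 1.0) for _ in range(D_ENC)]
# Stand-in for learned projection parameters.
weight = [
    [rng.uniform(-0.1, 0.1) for _ in range(D_ENC)]
    for _ in range(NUM_TOKENS * D_MODEL)
]
bias = [0.0] * (NUM_TOKENS * D_MODEL)

structure_tokens = project_to_tokens(
    crystal_embedding, weight, bias, NUM_TOKENS, D_MODEL
)
# Stand-in for three embedded instruction tokens.
text_embeddings = [[0.0] * D_MODEL for _ in range(3)]
sequence = build_input_sequence(structure_tokens, text_embeddings)
print(len(sequence), len(sequence[0]))
```

Under these assumptions the language model sees a fixed, small number of structure tokens regardless of how many atoms the crystal contains, which is one way to read "compress structural information into a token space".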
Primary Area: applications to physical sciences (physics, chemistry, biology, etc.)
Submission Number: 6219