Are LLMs Ready for Real-World Materials Discovery?

Published: 08 Jul 2024, Last Modified: 23 Jul 2024
Venue: AI4Mat-Vienna-2024 Poster
Readers: Everyone
License: CC BY 4.0
Submission Track: Full Paper
Submission Category: All of the above
Keywords: LLMs, materials discovery
TL;DR: We describe the current shortcomings that keep LLMs from being useful for materials discovery and outline a roadmap to bridge the gap.
Abstract: Large Language Models (LLMs) create exciting possibilities for powerful language processing tools to accelerate research in materials science. While LLMs have great potential to accelerate materials understanding and discovery, they currently fall short of being practical materials science tools. In this paper, we show relevant failure cases of LLMs in materials science that reveal current limitations in comprehending and reasoning over complex, interconnected materials science knowledge. Given those shortcomings, we outline a framework for developing Materials Science LLMs (MatSci-LLMs) that are grounded in materials science knowledge and in hypothesis generation followed by hypothesis testing. The path to performant MatSci-LLMs rests in large part on building high-quality, multi-modal datasets sourced from scientific literature, where various information extraction challenges persist. As such, we describe the key materials science information extraction challenges that must be overcome to build large-scale, multi-modal datasets that capture valuable materials science knowledge. Finally, we outline a roadmap for applying MatSci-LLMs to real-world materials discovery through six interacting steps: (1) Materials Query; (2) Data Retrieval; (3) Materials Design; (4) In Silico Evaluation; (5) Experiment Planning; (6) Experiment Execution.
Submission Number: 18
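
As a reading aid, the six roadmap steps named in the abstract can be sketched as an ordered pipeline. This is only a minimal illustration: the abstract says the steps interact but does not say how, so the strictly linear ordering, the `Stage` enum, the `run_roadmap` function, and the dictionary-based state are all hypothetical names and assumptions, not the authors' implementation.

```python
from enum import Enum
from typing import Callable, Dict, List, Tuple


class Stage(Enum):
    """The six roadmap steps, named as in the abstract."""
    MATERIALS_QUERY = 1
    DATA_RETRIEVAL = 2
    MATERIALS_DESIGN = 3
    IN_SILICO_EVALUATION = 4
    EXPERIMENT_PLANNING = 5
    EXPERIMENT_EXECUTION = 6


State = Dict[str, object]
Handler = Callable[[State], State]


def run_roadmap(handlers: Dict[Stage, Handler], query: str) -> Tuple[State, List[str]]:
    """Run the stages once, in numbered order (an assumed ordering).

    Stages without a registered handler pass the state through unchanged.
    Returns the final state and the names of the stages visited.
    """
    state: State = {"query": query}
    trace: List[str] = []
    for stage in Stage:  # Enum members iterate in definition order
        handler = handlers.get(stage)
        if handler is not None:
            state = handler(state)
        trace.append(stage.name)
    return state, trace


# Hypothetical usage: only the first stage has a (placeholder) handler.
handlers = {Stage.MATERIALS_QUERY: lambda s: {**s, "parsed": True}}
final_state, trace = run_roadmap(handlers, "example materials query")
```

A real system would likely need feedback between stages (for example, evaluation results informing a new design round), which this linear sketch deliberately leaves out.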