MatSeek: An Automated Knowledge-Driven Framework for Materials Research
Keywords: High Entropy Alloys, Large Language Models
TL;DR: MatSeek is a unified LLM-based framework that integrates automated extraction of structured alloy data with mining of relational knowledge from the literature, in order to guide and interpret machine-learning–driven inverse alloy design and discovery.
Abstract: The discovery of advanced alloy materials increasingly depends on reliable and interpretable knowledge extracted from the scientific literature to guide data-driven composition–property optimization. While large language models (LLMs) have enabled automated database construction, existing approaches typically separate data extraction from relational scientific knowledge mining, limiting interpretability and physical grounding in materials design.
Here we present $\textbf{MatSeek}$, an LLM-based framework that unifies structured alloy data and literature-derived scientific knowledge.
MatSeek combines an automated pipeline for building high-quality alloy databases with a knowledge extraction module that captures empirical trends, mechanistic insights, and composition design principles. This knowledge can accelerate machine-learning–driven alloy discovery by constraining the exploration of composition space, and it provides mechanistic explanations for model predictions.
Applying MatSeek to 10,240 high-entropy alloy publications, we construct a database of 27,438 records and demonstrate efficient, interpretable identification of promising alloy compositions. MatSeek establishes a unified, literature-grounded paradigm for knowledge-driven materials discovery.
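The idea of constraining composition space with literature-derived knowledge can be illustrated with a minimal sketch. It is not MatSeek's implementation: the rule format (per-element bounds on atomic fraction), the example bounds, and the function names `candidate_compositions`, `satisfies_rules`, and `constrained_space` are all assumptions made for illustration only.

```python
from itertools import product

# Hypothetical literature-derived design rules: allowed atomic-fraction
# range per element. These bounds are illustrative, not extracted values.
EXAMPLE_RULES = {
    "Al": (0.0, 0.25),
    "Cr": (0.25, 0.5),
}


def candidate_compositions(elements, step=0.25):
    """Enumerate compositions on a regular grid whose fractions sum to 1."""
    n = round(1 / step)  # number of grid intervals per element
    for counts in product(range(n + 1), repeat=len(elements) - 1):
        rest = n - sum(counts)  # the last element takes the remainder
        if rest < 0:
            continue
        fractions = [c / n for c in counts] + [rest / n]
        yield dict(zip(elements, fractions))


def satisfies_rules(composition, rules):
    """Return True if every bounded element lies inside its allowed range."""
    return all(
        low <= composition.get(element, 0.0) <= high
        for element, (low, high) in rules.items()
    )


def constrained_space(elements, rules, step=0.25):
    """Keep only the grid compositions that satisfy all rules."""
    return [
        comp
        for comp in candidate_compositions(elements, step)
        if satisfies_rules(comp, rules)
    ]
```

In a sketch like this, a downstream machine-learning model would only need to score the compositions returned by `constrained_space`, which is the sense in which extracted rules shrink the search.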
Submission Track: Full Paper
Submission Category: AI-Guided Design
Submission Number: 15