G-RAG: Knowledge Expansion in Material Science

NeurIPS 2024 Workshop MusIML Submission 11 Authors

15 Nov 2024 (modified: 16 Nov 2024), NeurIPS 2024 Workshop MusIML Submission, CC BY 4.0
Keywords: LLM, Vector/Naive RAG, Graph RAG, Graph Database, Entity Linking, Relation Extraction, Material Science, High-Entropy Alloy, Document Parsing
TL;DR: Improving Graph RAG for Material Science by adding an external knowledge base.
Abstract: In Material Science, effective information retrieval systems are essential for facilitating research. Traditional Retrieval-Augmented Generation (RAG) approaches for Large Language Models (LLMs) often encounter challenges such as outdated information, hallucinations, limited interpretability due to context constraints, and inaccurate retrieval. To address these issues, Graph RAG integrates graph databases into the retrieval process. Our proposed method processes Material Science documents by extracting key entities (referred to as MatIDs) from sentences; these entities are then used to query external Wikipedia knowledge bases (KBs) for additional relevant information. We implement an agent-based parsing technique to obtain a more detailed representation of the documents. Our improved version of Graph RAG, called G-RAG, further leverages a graph database to capture relationships between these entities, improving both retrieval accuracy and contextual understanding. This enhanced approach yields significant performance improvements in domains that require precise information retrieval, such as Material Science.
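
The pipeline described in the abstract (entity extraction, knowledge-base expansion, graph-based retrieval) can be illustrated with a minimal sketch. This is not the authors' implementation: the vocabulary `MAT_VOCAB`, the in-memory `TOY_KB` standing in for Wikipedia, the substring matcher, and the co-occurrence graph are all assumptions made for illustration, and the agent-based parsing step is omitted.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical entity vocabulary (assumption); the actual MatID extraction
# method is not specified in the abstract.
MAT_VOCAB = {"high-entropy alloy", "CoCrFeMnNi", "yield strength"}

# Toy stand-in for an external Wikipedia lookup (assumption); no network call.
TOY_KB = {
    "high-entropy alloy": "Alloy made of five or more elements in large proportions.",
    "CoCrFeMnNi": "Equiatomic high-entropy alloy, also called the Cantor alloy.",
    "yield strength": "Stress at which a material begins to deform plastically.",
}


def extract_entities(sentence):
    """Return vocabulary terms found in the sentence (case-insensitive substring match)."""
    lowered = sentence.lower()
    return sorted(term for term in MAT_VOCAB if term.lower() in lowered)


def build_graph(sentences):
    """Link entities that co-occur in a sentence (a simple proxy for relation extraction)."""
    graph = defaultdict(set)
    for sentence in sentences:
        for a, b in combinations(extract_entities(sentence), 2):
            graph[a].add(b)
            graph[b].add(a)
    return graph


def retrieve_context(query, graph, kb):
    """Collect KB descriptions for the query entities and their direct graph neighbors."""
    seeds = extract_entities(query)
    related = set(seeds)
    for seed in seeds:
        related |= graph.get(seed, set())
    return {entity: kb.get(entity, "") for entity in sorted(related)}


sentences = ["The CoCrFeMnNi high-entropy alloy shows high yield strength."]
graph = build_graph(sentences)
context = retrieve_context("What is known about CoCrFeMnNi?", graph, TOY_KB)
```

Here the query mentions only `CoCrFeMnNi`, but the graph neighbors pull in the descriptions of the related entities as well, which is the kind of context expansion the abstract attributes to the graph database. A real system would replace the substring matcher with entity linking and the dictionaries with a Wikipedia client and a graph database.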
Submission Number: 11