Semantic grounding of LLMs using knowledge graphs for query reformulation in medical information retrieval

Antonela Tommasel, Ira Assent

Published: 2024, Last Modified: 20 May 2025. Venue: IEEE Big Data 2024. License: CC BY-SA 4.0

Abstract: The widespread adoption of electronic health records has generated vast amounts of patient-related data, mostly in the form of unstructured text, which could be used for document retrieval. However, querying these texts in full can be challenging because they are long and unstructured, and may contain noise or irrelevant terms that interfere with the retrieval process. Recently, large language models (LLMs) have revolutionized natural language processing tasks. Despite their promising capabilities, their use in the medical domain has raised concerns due to their lack of understanding, hallucinations, and reliance on outdated knowledge. To address these concerns, we evaluate a Retrieval-Augmented Generation (RAG) approach that integrates medical knowledge graphs with LLMs to support query refinement in medical document retrieval tasks. Our initial findings from experiments on two benchmark TREC datasets demonstrate that knowledge graphs can effectively ground LLMs in the medical domain.
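The abstract does not specify how the knowledge graph and the LLM are combined, so the following is only a minimal sketch of one plausible knowledge-graph-grounded query reformulation pipeline, not the authors' method. Everything in it is an assumption made for illustration: the toy graph `TOY_KG`, the substring-based entity linking, the prompt wording, and the `llm` callable, which stands in for whatever model would be used.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical toy knowledge graph: entity -> list of (relation, neighbor).
# A real system would query a medical knowledge graph; these triples are
# illustrative only and are not taken from the paper.
KnowledgeGraph = Dict[str, List[Tuple[str, str]]]

TOY_KG: KnowledgeGraph = {
    "type 2 diabetes": [("treated_with", "metformin"), ("has_symptom", "polyuria")],
    "hypertension": [("treated_with", "lisinopril")],
}


def link_entities(text: str, kg: KnowledgeGraph) -> List[str]:
    """Naive entity linking: case-insensitive substring match on node names."""
    lowered = text.lower()
    return [entity for entity in kg if entity in lowered]


def retrieve_facts(entities: List[str], kg: KnowledgeGraph) -> List[str]:
    """Turn the outgoing edges of each linked entity into plain-text facts."""
    return [f"{e} {rel} {nbr}" for e in entities for rel, nbr in kg[e]]


def build_prompt(note: str, facts: List[str]) -> str:
    """Assemble a prompt that asks the model to stay within the retrieved facts."""
    fact_lines = "\n".join(f"- {fact}" for fact in facts) or "- (no facts found)"
    return (
        "Rewrite the patient note as a short search query.\n"
        "Use only terms supported by the note or by these facts:\n"
        f"{fact_lines}\n\n"
        f"Note: {note}\n"
        "Query:"
    )


def reformulate(note: str, kg: KnowledgeGraph, llm: Callable[[str], str]) -> str:
    """Link entities, retrieve graph facts, and let the LLM produce the query."""
    entities = link_entities(note, kg)
    facts = retrieve_facts(entities, kg)
    return llm(build_prompt(note, facts))


if __name__ == "__main__":
    # Stub model used only so the sketch runs without external services.
    def echo_llm(prompt: str) -> str:
        return "type 2 diabetes hypertension treatment"

    note = "Patient with Type 2 Diabetes and hypertension, reports fatigue."
    print(reformulate(note, TOY_KG, echo_llm))
```

Passing the model as a plain callable keeps the sketch independent of any particular LLM API. The retrieved facts are placed in the prompt so that the generated query is constrained by graph knowledge rather than by the model's parametric memory alone.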