Concept-based Biomedical Text Expansion via Large Language Models

Published: 01 Jan 2024, Last Modified: 19 Feb 2025, Venue: BIBM 2024, License: CC BY-SA 4.0
Abstract: Concept-based approaches have emerged as a promising way to improve search accuracy in biomedical information retrieval (BIR). These approaches aim to bridge the semantic gap between user queries and document content by mapping both to a shared conceptual space. However, existing methods often struggle with limited concept coverage and inaccurate concept mapping. Recent large language models (LLMs) have demonstrated extensive concept coverage and an ability to accurately connect text with relevant concepts, which suggests they can address both challenges. Motivated by these findings, we propose Concept-based Biomedical Text Expansion (CBTE), a novel method that leverages LLMs to identify relevant concepts in both queries and documents and then uses these concepts to expand them. The relevance between expanded queries and expanded documents is scored with the BM25 sparse retrieval framework. To validate CBTE’s effectiveness, we conducted comprehensive experiments on two well-known biomedical datasets, NFCorpus and TREC-COVID. The results indicate that CBTE significantly outperforms existing baselines.
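The pipeline described in the abstract (identify concepts, expand both sides, score with BM25) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the prompts, the model, or the BM25 settings. Here the LLM is replaced by a hypothetical lookup table (`TOY_CONCEPTS`), the helper names (`extract_concepts`, `expand`, `bm25_scores`) are invented for illustration, and the BM25 parameters `k1=1.2` and `b=0.75` are common defaults assumed for the example.

```python
import math
import re
from collections import Counter

# ASSUMPTION: a toy phrase-to-concept table standing in for the LLM.
# The method in the abstract queries an LLM; this table is illustrative only.
TOY_CONCEPTS = {
    "heart attack": ["myocardial infarction"],
    "myocardial infarction": ["heart attack"],
    "high blood pressure": ["hypertension"],
}


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def extract_concepts(text):
    """Hypothetical concept identifier: return concepts for known phrases."""
    lowered = text.lower()
    concepts = []
    for phrase, mapped in TOY_CONCEPTS.items():
        if phrase in lowered:
            concepts.extend(mapped)
    return concepts


def expand(text):
    """Append the identified concepts to the original text."""
    concepts = extract_concepts(text)
    return text + " " + " ".join(concepts) if concepts else text


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with Okapi BM25."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(t) for t in tokenized) / n
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        tf = Counter(toks)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(toks) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "Aspirin reduces the risk of myocardial infarction.",
    "Dietary fiber improves gut health.",
]
query = "heart attack prevention"

# Without expansion the query shares no terms with either document.
print(bm25_scores(query, docs))
# With expansion on both sides, the first document gains matching terms.
print(bm25_scores(expand(query), [expand(d) for d in docs]))
```

In this toy setting the unexpanded query has no lexical overlap with the relevant document, while the expanded pair does, which is the semantic gap the abstract refers to. How CBTE actually formats or weights the added concepts is not stated in the abstract, so simple concatenation is only an assumption here.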