Augmenting Black-box LLMs with Medical Textbooks for Biomedical Question Answering

TMLR Paper 2598 Authors

29 Apr 2024 (modified: 17 Sept 2024) · Withdrawn by Authors · CC BY 4.0
Abstract: Large language models (LLMs) such as ChatGPT have demonstrated impressive abilities in generating responses to human instructions. However, their use in the medical field can be challenging because they lack specific, in-depth domain knowledge. A common solution for integrating external knowledge into LLMs is a retrieval-augmented generation framework, yet the choice of corpus and the design of the retrieval pipeline for medical question-answering tasks remain under-explored. In this study, we present LLMs Augmented with Medical Textbooks (LLM-AMT), a system designed to enhance the proficiency of LLMs in specialized domains. LLM-AMT integrates authoritative medical textbooks into the LLM framework through plug-and-play modules: a Query Augmenter, a Hybrid Textbook Retriever, and a Knowledge Self-Refiner, which together incorporate authoritative medical knowledge, complemented by an LLM Reader that aids contextual understanding. Experimental results on three medical QA tasks show that LLM-AMT significantly improves response quality, with accuracy gains ranging from 11.6% to 16.6%. Notably, with GPT-4-Turbo as the base model, LLM-AMT outperforms the specialized Med-PaLM 2 model, which was pre-trained on a massive medical corpus, by 2-3%. We also found that, despite being 100× smaller, medical textbooks are a more effective retrieval corpus than Wikipedia in the medical domain, boosting performance by 7.8%-13.7%. We will open-source the code for this work.
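The retrieval pipeline named in the abstract (Query Augmenter → Hybrid Textbook Retriever → Knowledge Self-Refiner → LLM Reader) can be sketched roughly as below. Every function name and behavior here is an illustrative assumption: the real system would use LLM calls and sparse-plus-dense retrieval where this toy version uses string overlap.

```python
# Hypothetical sketch of an LLM-AMT-style pipeline; all names and
# heuristics are illustrative assumptions, not the paper's implementation.

def augment_query(question: str) -> list[str]:
    """Query Augmenter: expand the question into several retrieval queries
    (a real system would prompt an LLM to rewrite/expand the question)."""
    return [question, f"key medical terms in: {question}"]

def hybrid_retrieve(queries: list[str], corpus: list[str], k: int = 2) -> list[str]:
    """Hybrid Textbook Retriever: rank textbook passages by a toy lexical
    overlap score (standing in for combined sparse + dense retrieval)."""
    def score(q: str, passage: str) -> int:
        return len(set(q.lower().split()) & set(passage.lower().split()))
    scored = {p: max(score(q, p) for q in queries) for p in corpus}
    return sorted(scored, key=scored.get, reverse=True)[:k]

def self_refine(passages: list[str], question: str) -> list[str]:
    """Knowledge Self-Refiner: drop passages judged irrelevant
    (a trivial overlap filter stands in for an LLM relevance check)."""
    q_terms = set(question.lower().split())
    return [p for p in passages if q_terms & set(p.lower().split())]

def read(question: str, passages: list[str]) -> str:
    """LLM Reader: answer the question from the refined context (stubbed)."""
    context = " ".join(passages)
    return f"Answer to '{question}' grounded in: {context}"

corpus = [
    "Metformin is a first-line treatment for type 2 diabetes.",
    "The heart pumps blood through the circulatory system.",
]
question = "What is the first-line treatment for type 2 diabetes?"
queries = augment_query(question)
passages = self_refine(hybrid_retrieve(queries, corpus), question)
print(read(question, passages))
```

In this toy run the diabetes passage outranks the unrelated one, so the Reader's stubbed answer is grounded in the relevant textbook sentence first.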
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Lei_Li11
Submission Number: 2598