Med42 - Evaluating Fine-Tuning Strategies for Medical LLMs: Full-Parameter vs. Parameter-Efficient Approaches

Published: 29 Feb 2024, Last Modified: 02 May 2024
Venue: AAAI 2024 SSS on Clinical FMs
License: CC BY 4.0
Track: Traditional track
Keywords: LLMs, Clinical, Fine-tuning, Evaluation
TL;DR: A comparison of parameter-efficient and full-parameter fine-tuning strategies for training a large clinical language model.
Abstract: This study presents a comprehensive analysis and comparison of two predominant fine-tuning methodologies, full-parameter fine-tuning and parameter-efficient tuning, in the context of medical Large Language Models (LLMs). We developed and refined a series of LLMs, based on the Llama-2 architecture, specifically designed to enhance medical knowledge retrieval, reasoning, and question-answering capabilities. Our experiments systematically evaluate the effectiveness of these tuning strategies across several well-known medical benchmarks. Notably, our medical LLM achieved 72% accuracy on the US Medical Licensing Examination (USMLE) datasets, setting a new performance standard among openly available medical LLMs. Through this comparative analysis, we aim to identify the most effective and efficient method for fine-tuning LLMs in the medical domain, thereby contributing to the advancement of AI-driven healthcare applications.
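For readers unfamiliar with the two strategies the abstract contrasts, the following is a minimal Python sketch using the Hugging Face transformers and peft libraries. It assumes LoRA as the parameter-efficient method; the base checkpoint and all hyperparameters below are illustrative assumptions, not values taken from the paper (which states only that the models are based on Llama-2).

# Sketch of the two fine-tuning strategies compared in the abstract.
# Assumptions: LoRA as the parameter-efficient method, a 7B Llama-2
# checkpoint, and illustrative hyperparameters (none are from the paper).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

BASE = "meta-llama/Llama-2-7b-hf"  # gated checkpoint; requires HF access approval

# Full-parameter fine-tuning: every weight in the base model is trainable,
# so the optimizer updates all ~7B parameters.
full_model = AutoModelForCausalLM.from_pretrained(BASE)
assert all(p.requires_grad for p in full_model.parameters())

# Parameter-efficient fine-tuning (LoRA): base weights are frozen and small
# low-rank adapter matrices are injected into selected attention projections.
lora_config = LoraConfig(
    r=16,                                 # adapter rank (illustrative)
    lora_alpha=32,                        # adapter scaling factor (illustrative)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # which projections receive adapters
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(AutoModelForCausalLM.from_pretrained(BASE), lora_config)
peft_model.print_trainable_parameters()   # typically well under 1% of all weights

Under LoRA, only the injected low-rank adapters receive gradient updates, which is what makes the approach memory- and compute-efficient relative to updating all base weights; the paper's contribution is an empirical comparison of how the two regimes perform on medical benchmarks.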
Presentation And Attendance Policy: I have read and agree with the symposium's policy on behalf of myself and my co-authors.
Ethics Board Approval: No, our research does not involve datasets that need IRB approval or its equivalent.
Data And Code Availability: Yes, we will make data and code available upon acceptance.
Primary Area: Clinical foundation models
Student First Author: No, the primary author of the manuscript is NOT a student.
Submission Number: 7