Instruction Tuning Large Language Models to Understand Electronic Health Records

Zhenbang Wu; Anant Dadu; Michael Nalls; Faraz Faghri; Jimeng Sun

Published: 26 Sept 2024, Last Modified: 13 Nov 2024. Venue: NeurIPS 2024 Track Datasets and Benchmarks (Spotlight). License: CC BY 4.0

Keywords: electronic health record, clinical predictive modeling, instruction following, large language models, foundation models

Abstract: Large language models (LLMs) have shown impressive capabilities in solving a wide range of tasks based on human instructions. However, developing a conversational AI assistant for electronic health record (EHR) data remains challenging due to (1) the lack of large-scale instruction-following datasets and (2) the limitations of existing model architectures in handling complex and heterogeneous EHR data. In this paper, we introduce MIMIC-Instr, a dataset comprising over 400K open-ended instruction-following examples derived from the MIMIC-IV EHR database. This dataset covers various topics and is suitable for instruction-tuning general-purpose LLMs for diverse clinical use cases. Additionally, we propose Llemr, a general framework that enables LLMs to process and interpret EHRs with complex data structures. Llemr demonstrates competitive performance in answering a wide range of patient-related questions based on EHR data. Furthermore, our evaluations on clinical predictive modeling benchmarks reveal that the fine-tuned Llemr achieves performance comparable to state-of-the-art (SOTA) baselines using curated features. The dataset and code are available at https://github.com/zzachw/llemr.

Supplementary Material: pdf

Submission Number: 1434
