Analyzing the Capabilities of Large Language Models in Annotating Substance Use Behavior from Clinical Notes

ACL ARR 2024 June Submission3356 Authors

16 Jun 2024 (modified: 22 Jul 2024) · ACL ARR 2024 June Submission · CC BY 4.0
Abstract: Large language models (LLMs) have been trialed for annotating complex medical information. In this paper, we explore the capabilities of LLMs in annotating the substance use behavior of patients from clinical notes. We used the MIMIC-SBDH dataset, which is based on MIMIC-III discharge summaries, and annotated alcohol use, tobacco use, and drug use behavior into five categories: Past, Present, Never, Unsure, and nan, using the Llama3 model. The model achieved high match scores for the Past category, ranging from 83.26% to 90.62%. Overall, the model predicted alcohol, drug, and tobacco behaviors with respective accuracies of 51.70%, 31.37%, and 72.62%. However, the model performed poorly in annotating the Unsure category, with match scores ranging from 2.25% to 3.47%. Our experiments provide insight into performance patterns and challenges in using LLMs to annotate complex healthcare data.
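The abstract reports two kinds of numbers: per-category match scores (how often the model reproduces the gold label within one category) and overall accuracy per behavior. A minimal sketch of how such metrics could be computed is shown below; the label lists and category names follow the abstract, but the data and function names are hypothetical, not taken from the paper's code.

```python
from collections import Counter

# The five annotation categories described in the abstract.
CATEGORIES = ["Past", "Present", "Never", "Unsure", "nan"]

def per_category_match(gold, pred):
    """Fraction of gold labels in each category that the model reproduced."""
    totals = Counter(gold)
    hits = Counter(g for g, p in zip(gold, pred) if g == p)
    return {c: hits[c] / totals[c] for c in CATEGORIES if totals[c]}

def overall_accuracy(gold, pred):
    """Fraction of notes where the predicted label matches the gold label."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)

# Hypothetical gold vs. model labels for one behavior (e.g., alcohol use).
gold = ["Past", "Past", "Never", "Unsure", "Present", "nan"]
pred = ["Past", "Past", "Never", "Present", "Present", "nan"]
print(per_category_match(gold, pred))  # Unsure scores 0.0; other categories 1.0
print(overall_accuracy(gold, pred))    # 5 of 6 labels match
```

This illustrates how a high Past match score can coexist with a low Unsure score: Unsure cases are rare and easily confused with definite labels, so even a few misses drive its per-category rate toward zero.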
Paper Type: Short
Research Area: Information Extraction
Research Area Keywords: LLMs, Substance use, EHRs, SDOH
Contribution Types: Model analysis & interpretability, NLP engineering experiment, Data analysis
Languages Studied: English
Submission Number: 3356