Automated clinical coding using off-the-shelf large language models

Published: 27 Oct 2023, Last Modified: 10 Nov 2023DGM4H NeurIPS 2023 PosterEveryoneRevisionsBibTeX
Keywords: large language models, clinical coding, multi-label classification, zero-shot prediction
TL;DR: We present the first method using generative LLMs to perform automated clinical coding; our method does not require fine-tuning and gives good zero-shot performance.
Abstract: The task of assigning diagnostic ICD codes to patient hospital admissions is typically performed by expert human coders. Efforts towards automated ICD coding are dominated by supervised deep learning models. However, difficulties in learning to predict the large number of rare codes remain a barrier to adoption in clinical practice. In this work, we leverage off-the-shelf pre-trained generative large language models (LLMs) to develop a practical solution that is suitable for zero-shot and few-shot code assignment, with no need for further task-specific training. Unsupervised pre-training alone does not guarantee precise knowledge of the ICD ontology and specialist clinical coding task, therefore we frame the task as information extraction, providing a description of each coded concept and asking the model to retrieve related mentions. For efficiency, rather than iterating over all codes, we leverage the hierarchical nature of the ICD ontology to sparsely search for relevant codes. We validate our method using Llama-2, GPT-3.5 and GPT-4 on the CodiEsp dataset of ICD-coded clinical case documents. Our tree-search method achieves state-of-the-art performance on rarer classes, achieving the best macro-F1 of 0.225, whilst achieving slightly lower micro-F1 of 0.157, compared to 0.216 and 0.219 respectively from PLM-ICD. To the best of our knowledge, this is the first method for automated clinical coding requiring no task-specific learning.
Submission Number: 30