High Throughput Phenotyping of Clinical Text Using Large Language Models

Daniel B. Hier; Syed Ilyas Munzir; Anne Stahlfeld; Tayo Obafemi-Ajayi; Michael Dennis Carrithers

Published: 25 Sept 2024, Last Modified: 21 Oct 2024. Venue: IEEE BHI'24. License: CC BY 4.0

Keywords: phenotype, large language model, natural language processing, high throughput, OMIM, neurology, HPO, GPT-4

TL;DR: We demonstrate that large language models can effectively perform high throughput phenotyping of clinical text.

Abstract: High-throughput phenotyping automates the mapping of patient signs to standardized concepts, such as those in the Human Phenotype Ontology (HPO), a process critical to precision medicine. We evaluated the automated phenotyping of clinical summaries from the Online Mendelian Inheritance in Man (OMIM) database using a large language model. Various APIs were used to automate text retrieval, sign identification, categorization, and normalization. GPT-4 outperformed GPT-3.5-Turbo in identifying, categorizing, and normalizing signs, achieving concordance with manual annotators comparable to the concordance between manual annotators. While GPT-4 demonstrates high accuracy in sign identification and categorization, limitations remain in sign normalization, particularly in retrieving the correct HPO ID for a normalized term. Methods such as retrieval-augmented generation, changes in pre-training, and additional fine-tuning may help address these limitations. The combination of APIs with large language models presents a promising approach for high-throughput phenotyping of free text.
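The normalization limitation described in the abstract (a model returning an HPO ID that does not match the normalized term) can be illustrated with a small post-hoc check. The following is a minimal sketch, not the authors' pipeline: the function name `check_normalization`, the status strings, and the three-entry lookup table are assumptions made for illustration, and a real system would load the full ontology instead of a hand-written subset.

```python
import re

# Illustrative subset of HPO labels (an assumption for this sketch);
# a real pipeline would load the complete ontology release.
HPO_LABELS = {
    "HP:0001250": "Seizure",
    "HP:0001251": "Ataxia",
    "HP:0001324": "Muscle weakness",
}

# HPO identifiers have the form "HP:" followed by seven digits.
HPO_ID_PATTERN = re.compile(r"HP:\d{7}")


def check_normalization(term: str, hpo_id: str) -> str:
    """Classify a (term, HPO ID) pair returned by a language model.

    Returns one of: "ok", "malformed_id", "unknown_id", "label_mismatch".
    """
    if not HPO_ID_PATTERN.fullmatch(hpo_id):
        return "malformed_id"
    label = HPO_LABELS.get(hpo_id)
    if label is None:
        return "unknown_id"
    # Compare labels case-insensitively, ignoring surrounding whitespace.
    if label.casefold() != term.strip().casefold():
        return "label_mismatch"
    return "ok"


if __name__ == "__main__":
    for term, hpo_id in [("Ataxia", "HP:0001251"), ("Ataxia", "HP:0001250")]:
        print(term, hpo_id, check_normalization(term, hpo_id))
```

A check of this kind only flags mismatches between a term and its ID; it does not supply the correct ID, which is where approaches such as retrieval-augmented generation would come in.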

Track: 2. Large Language Models for biomedical and clinical research

Registration Id: ZCNMKHVQ4NH

Submission Number: 235
