Improving Interpretability of Radiology Report-based Pediatric Brain Tumor Pathology Classification and Key-phrases Extraction Using Large Language Models

Chen Zhao; Kareem Kudus; Sara Ketabi; Khashayar Namdar; Matthias W. Wagner; Birgit Betina Ertl-Wagner; Farzad Khalvati

Published: 25 Sept 2024, Last Modified: 23 Oct 2024

Venue: IEEE BHI'24

License: CC BY 4.0

Keywords: BERT, deep learning, keywords extraction, interpretability, key-phrases extraction, MRI, NLP, radiology reports

Abstract: Radiology reports are crucial for bridging the expertise of radiologists and other clinicians. Machine learning (ML) algorithms have been shown to be capable of performing downstream tasks, such as predicting the necessity of future follow-up procedures, based on past radiology reports. However, for clinicians to adopt these models and for radiologists to validate their results, model explainability is a priority. We used BERT models to classify pediatric brain tumor pathologies. These large language models (LLMs) enable accurate report-level classification without the need for costly word-level annotations. To identify and extract keywords and key-phrases related to distinct pathologies in radiology reports, we used a modified Term Frequency-Inverse Document Frequency (TF-IDF) to determine phrase importance based on prevalence and attribution scores. We achieved an overall multiclass Area Under the Receiver Operating Characteristic Curve (AUROC) of 79.57% using ClinicalBERT. The per-class AUROCs were 86%, 71.2%, and 81.5% for 'pilocytic astrocytoma', 'low grade astrocytoma', and 'other' pathologies, respectively. Our explainability analysis identified hypotonia and mesencephalon as the most important terms for 'pilocytic astrocytoma' and 'low grade astrocytoma', respectively.
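
The abstract does not spell out how the TF-IDF was modified, so the following is only a minimal sketch of one plausible reading: the term-frequency count is replaced by summed positive attribution scores, and the inverse-document-frequency part is computed across pathology classes. The function name `attribution_weighted_tfidf`, the input layout, the smoothing in the logarithm, and all attribution values below are assumptions made for illustration; they are toy numbers, not results from the paper.

```python
import math
from collections import defaultdict


def attribution_weighted_tfidf(reports):
    """Hypothetical attribution-weighted TF-IDF (an assumption, not the authors' formula).

    reports: list of (label, [(term, attribution), ...]) tuples, one per report.
    Returns {label: {term: score}}.
    """
    # label -> term -> summed positive attribution (plays the role of "tf")
    weight = defaultdict(lambda: defaultdict(float))
    # term -> set of labels in which the term received positive attribution
    classes_with_term = defaultdict(set)

    for label, pairs in reports:
        for term, attr in pairs:
            if attr > 0:  # keep only evidence in favour of the class
                weight[label][term] += attr
                classes_with_term[term].add(label)

    n_classes = len(weight)
    scores = {}
    for label, terms in weight.items():
        total = sum(terms.values())  # > 0 because only positive values were added
        scores[label] = {
            # normalised attribution mass times a smoothed class-level idf
            term: (w / total)
            * math.log((1 + n_classes) / len(classes_with_term[term]))
            for term, w in terms.items()
        }
    return scores


# Toy input: attribution values are invented purely to exercise the function.
toy_reports = [
    ("pilocytic astrocytoma", [("hypotonia", 0.8), ("mass", 0.2)]),
    ("low grade astrocytoma", [("mesencephalon", 0.6), ("mass", 0.4)]),
    ("other", [("mass", 0.5), ("hypotonia", -0.3)]),
]
scores = attribution_weighted_tfidf(toy_reports)
top_terms = {label: max(terms, key=terms.get) for label, terms in scores.items()}
```

In this sketch a term such as "mass", which receives positive attribution in every class, is down-weighted by the class-level idf, while a term concentrated in one class rises to the top of that class's ranking. Other choices (per-report rather than per-class document frequency, or keeping negative attributions) would be equally defensible.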

Track: 2. Large Language Models for biomedical and clinical research

Registration Id: PTNPB94WTNZ

Submission Number: 143
