Readers: Everyone (since 25 Sept 2024). License: CC BY 4.0
Radiology reports are crucial for bridging the expertise of radiologists and other clinicians. Machine Learning (ML) algorithms have been shown to be capable of performing downstream tasks, such as predicting the necessity of future follow-up procedures, based on past radiology reports. However, for clinicians to adopt these models and for radiologists to validate their results, model explainability is a priority. We use BERT models to classify pediatric brain tumor pathologies. These large language models (LLMs) enable accurate report-level classification without the need for costly word-level annotations. To identify and extract keywords and key phrases related to distinct pathologies in radiology reports, we use a modified Term Frequency-Inverse Document Frequency (TF-IDF) that determines phrase importance from both prevalence and attribution scores. Using ClinicalBERT, we achieved an overall multiclass Area Under the Receiver Operating Characteristic Curve (AUROC) of 79.57%. The per-class AUROCs were 86%, 71.2%, and 81.5% for 'pilocytic astrocytoma', 'low grade astrocytoma', and 'other' pathologies, respectively. Our explainability analysis identified hypotonia and mesencephalon as the most important terms for 'pilocytic astrocytoma' and 'low grade astrocytoma', respectively.
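To make the idea of combining prevalence with attribution scores concrete, the following is a minimal sketch of one possible attribution-weighted TF-IDF. It is not the authors' exact formulation, which the abstract does not specify. It assumes that the term-frequency factor is replaced by the summed attribution a term receives in reports of the target class, and that the inverse document frequency is computed over all reports. The function name `attribution_tfidf`, the input layout, and the toy terms and values are all hypothetical.

```python
import math
from collections import defaultdict


def attribution_tfidf(reports, target_label):
    """Rank terms for one class by (summed attribution) x (inverse document frequency).

    reports: list of (label, [(term, attribution), ...]) pairs, where each
    attribution is assumed to come from some token-level attribution method
    applied to the classifier. This layout is an assumption for illustration.
    """
    n_docs = len(reports)
    doc_freq = defaultdict(int)    # number of reports containing each term
    attr_sum = defaultdict(float)  # summed attribution within the target class

    for label, tokens in reports:
        # Count each term at most once per report for the document frequency.
        for term in {t for t, _ in tokens}:
            doc_freq[term] += 1
        if label == target_label:
            for term, attr in tokens:
                attr_sum[term] += attr

    # Terms that occur in every report get an IDF of log(1) = 0.
    scores = {
        term: total * math.log(n_docs / doc_freq[term])
        for term, total in attr_sum.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up toy data: placeholder terms and attribution values, not study results.
toy_reports = [
    ("A", [("alpha", 0.9), ("common", 0.5)]),
    ("A", [("alpha", 0.7), ("common", 0.4), ("beta", 0.1)]),
    ("B", [("beta", 0.8), ("common", 0.6)]),
]

ranked = attribution_tfidf(toy_reports, "A")
for term, score in ranked:
    print(f"{term}: {score:.3f}")
```

In this toy setup, a term that appears in every report ("common") scores zero regardless of its attribution, while a term that is both highly attributed and concentrated in class "A" ("alpha") ranks first. Other weightings, such as averaging attributions or using absolute values, would be equally plausible under the abstract's description.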