Language Models for Fall Risk Assessment in Children with Cerebral Palsy using Electronic Medical Records

Published: 25 Sept 2024, Last Modified: 23 Oct 2024
Venue: IEEE BHI'24
License: CC BY 4.0
Keywords: Cerebral Palsy; BERT; Fall Prediction; Language Model; Electronic Medical Records
TL;DR: Pre-trained and non-pre-trained BERT-based models were used to predict fall risk in children with cerebral palsy, and an aggregated decision-making process was used to mitigate the context window limitation.
Abstract: Children with Cerebral Palsy (CP) face a heightened risk of falls, which complicates treatment outcomes. Traditional manual scoring methods such as the Cummings Fall Assessment Score are subjective and labor-intensive because of the diverse characterization of CP. Leveraging electronic medical records (EMRs) and advanced language models (LMs) offers a data-driven alternative for fall risk assessment. Because the CP population is highly heterogeneous, LMs have not been thoroughly applied to assess fall risk in this cohort. To address this, we used unstructured EMR data from 1,604 patients with CP from the Shriners Children’s Hospital Network, employing Clinical BioBERT, BioBERT, and BERTbase to predict fall risk. We explored two approaches: continued pre-training followed by fine-tuning with labeled data, and direct fine-tuning with supervised labeled data. Our findings indicate that continued pre-training does not guarantee performance improvements on downstream tasks relative to fine-tuning alone. This reduces the need for an extensive pre-training process. Models that were only fine-tuned achieved an F1 score of 0.71 in predicting fall assessment risk, and an F1 score of 0.74 when we excluded the borderline fall assessment score during training. The best performance achieved by Clinical BioBERT was a recall of 0.72 and a specificity of 0.80. Furthermore, CP is a complex, multifaceted condition that often involves lengthy clinical notes exceeding 30,000 characters, which many LMs cannot process in a single context window. We propose a process in which each note is assessed as a group of samples that each fit the context window, and a collective decision is then made by probability weighted majority voting (PWMV). This approach improved model performance by 1% to 3% and demonstrated its effectiveness in enhancing fall risk prediction for children with CP.
Our work lays the groundwork for evidence-based, data-driven treatment planning in pediatric CP clinical practice and research, with the aim of improving the efficiency and accuracy of CP patient care.
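
The chunk-then-vote process described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact PWMV formula, so the weighting rule here is an assumption: each chunk votes for the class its probability favors, and the vote is weighted by that class probability. The names `chunk_text` and `pwmv`, the character-based chunk size, and the 0.5 threshold are hypothetical choices for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def chunk_text(text: str, max_chars: int) -> List[str]:
    """Split a long note into consecutive pieces of at most max_chars characters.

    Assumption: a character budget stands in for the model's token-based
    context window; a real pipeline would chunk by tokens.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def pwmv(chunk_probs: Sequence[float], threshold: float = 0.5) -> int:
    """Probability weighted majority vote over per-chunk fall-risk probabilities.

    Assumed rule: a chunk with probability p >= threshold votes for class 1
    with weight p; otherwise it votes for class 0 with weight 1 - p.
    The class with the larger total weight wins (ties go to class 1).
    """
    if not chunk_probs:
        raise ValueError("at least one chunk probability is required")
    weight_pos = sum(p for p in chunk_probs if p >= threshold)
    weight_neg = sum(1.0 - p for p in chunk_probs if p < threshold)
    return 1 if weight_pos >= weight_neg else 0


# Hypothetical per-chunk probabilities from a fine-tuned classifier.
# Two confident positive chunks outweigh three weakly negative ones,
# whereas an unweighted majority vote would return class 0.
note_level_label = pwmv([0.95, 0.95, 0.45, 0.45, 0.45])
```

Under this assumed rule, confident chunks contribute more to the note-level decision than uncertain ones, which is one plausible way a weighted vote could differ from a plain majority vote over chunks.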
Track: 2. Large Language Models for biomedical and clinical research
Registration Id: MDNWJMVSVKW
Submission Number: 90