Minimizing Chronic Kidney Disease (CKD) Underdiagnosis Using Machine Learning

Published: 29 Feb 2024, Last Modified: 02 May 2024
Venue: AAAI 2024 SSS on Clinical FMs
License: CC BY 4.0
Track: Traditional track
Keywords: Machine Learning, Chronic Kidney Disease, CKD, Value-Based Care, MIMIC-IV, Diagnosis, Prediction, Screening, Data-driven
TL;DR: We use Machine Learning models to predict a patient's CKD status, with the aim of facilitating early screening and intervention for patients at risk of CKD.
Abstract: Chronic Kidney Disease (CKD) is a prevalent and devastating progressive disease affecting up to 14% (>35.5 million individuals) of the United States population and costing Medicare well over $64 billion annually. As many as 90% of individuals with CKD are undiagnosed, indicating the need for better tools to diagnose CKD and prevent unnoticed disease progression. However, current methods of assessing CKD are limited in accessibility, practicality, and accuracy. This study seeks to address these limitations by developing a data-driven method to assess CKD risk from a large open-source database of electronic health records that has not previously been applied to CKD prediction. Machine Learning (ML) methods were used to develop a software tool that predicts patient CKD status from patient-specific demographic data, vital signs, and past medical history. Of the ML models used in this study, a Random Forest Classifier performed best at predicting CKD diagnosis, with an accuracy of 0.875, an Area Under the Receiver Operating Characteristic Curve (AUROC) of 0.927, and an F1 score of 0.765. Our results indicate that ML-based approaches can help facilitate early screening and intervention for patients at risk of CKD. For progressive diseases like CKD, which become more devastating and more expensive to treat as they advance, ML models that leverage electronic health record data can reduce the high rate of missed diagnoses.
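For readers unfamiliar with the three metrics reported in the abstract (accuracy, AUROC, and F1 score), the following is a minimal pure-Python sketch of how each is computed for a binary classifier. The labels, scores, and the 0.5 decision threshold below are made-up toy values chosen for illustration; they are not taken from the study, and this is not the authors' evaluation code.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auroc(y_true, y_score):
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise form is O(n_pos * n_neg),
    which is fine for a small illustration.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = CKD, 0 = no CKD.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_score = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1, 0.8, 0.3]

# Assumed decision threshold of 0.5 for turning scores into class labels.
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

print("accuracy:", accuracy(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred))
print("AUROC:", auroc(y_true, y_score))
```

Accuracy and F1 depend on the chosen threshold, while AUROC is computed from the raw scores and summarizes ranking quality across all thresholds. F1 is often the more informative of the two thresholded metrics when positive cases are a minority, since accuracy can look high even if many positives are missed.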
Presentation And Attendance Policy: I have read and agree with the symposium's policy on behalf of myself and my co-authors.
Ethics Board Approval: No, our research does not involve datasets that need IRB approval or its equivalent.
Data And Code Availability: Yes, we will make data and code available upon acceptance.
Primary Area: Mechanistic ML approaches for healthcare
Student First Author: Yes, the primary author of the manuscript is a student.
Submission Number: 5