Keywords: EHR Time Series, Transferability, Continual Learning, Domain Incremental Learning
Abstract: In recent years, machine learning has made significant progress in clinical outcome prediction, with increasingly accurate results. However, training these models demands substantial resources, including data collection, labeling, and computational power, which makes it infeasible for many smaller hospitals to develop their own. An alternative is to transfer a model trained by a large hospital to smaller hospitals, which then fine-tune it on their own patient data.
These models, however, are often trained and validated on data from a single hospital, raising concerns about how well they generalize to new data. We find notable differences in measurement distributions and frequencies across regions of the United States. To address this, we propose a benchmark that tests a machine learning model's ability to transfer from a source domain to different regions across the country. This benchmark assesses a model's capacity to learn meaningful information about each new domain while retaining key features from the original domain.
Using this benchmark, we frame the transfer of a machine learning model from one region to another as a domain incremental learning problem. While the task of patient outcome prediction remains the same, the input data distribution varies, necessitating a model that can effectively manage these shifts. We evaluate two popular domain incremental learning methods: data replay, which stores examples from previous data sources for fine-tuning on the current source, and Elastic Weight Consolidation (EWC), a model parameter regularization method that maintains features important for both data sources.
Finally, we propose a new domain incremental learning method that combines EWC with data replay and allows the number of updates that use data from previous sources to be adjusted. Our results show that the proposed method outperforms both EWC alone and data replay alone. We also highlight specific shortcomings related to model transferability in the clinical setting, underscoring the need for further research and development in this area.
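For readers unfamiliar with the two baselines named in the abstract, the sketch below illustrates their core ingredients in plain Python: the standard quadratic EWC penalty and a fixed-size replay buffer. It is not the method proposed in this submission. The names `ewc_penalty` and `ReplayBuffer`, the diagonal Fisher estimate passed in as a list, and the choice of reservoir sampling to fill the buffer are all assumptions made for illustration.

```python
import random


def ewc_penalty(params, anchor_params, fisher_diag, lam):
    """Quadratic EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta*_i)^2.

    params        -- current parameter values (flat list of floats)
    anchor_params -- parameter values after training on the previous domain
    fisher_diag   -- assumed diagonal Fisher information, one value per parameter
    lam           -- regularization strength (hypothetical hyperparameter name)
    """
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )


class ReplayBuffer:
    """Fixed-size store of examples from earlier domains.

    Uses reservoir sampling (an illustrative choice) so that every example
    seen so far has the same chance of being kept.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self._rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Replace a stored example with probability capacity / seen.
            j = self._rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        # Draw up to k stored examples without replacement.
        return self._rng.sample(self.items, min(k, len(self.items)))
```

In a training loop, the penalty would typically be added to the task loss on the current domain, while samples drawn from the buffer would be mixed into (or interleaved with) the current domain's batches.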
Track: 11. General Track
Registration Id: R6N7MDCYMRY
Submission Number: 222