Keywords: Health Informatics, EHR, Diagnosis Prediction, Healthcare Representation
TL;DR: We introduce OntoFAR, a multi-ontology fusion framework that enhances medical concept representation learning in EHR models.
Abstract: Medical ontology graphs, which typically organize and relate comprehensive medical concepts in a hierarchical structure, can map a rich set of external knowledge onto the specific medical codes observed in electronic health records (EHRs). Through the connectivity in ontologies, healthcare predictive models can use ancestor, descendant, or sibling information to add supplementary context to medical codes, thereby augmenting the expressiveness of EHR representations. However, existing approaches are limited by the isolation of heterogeneous ontology systems (e.g., conditions vs. drugs): different types of ontology concepts have to be learned individually, and only homogeneous ontology relationships can be exploited. This limitation prevents existing methods from fully leveraging cross-ontology relationships, which could substantially enhance healthcare representations.
In this paper, we propose OntoFAR, a framework that fuses multiple ontology graphs, utilizing the collaboration across ontologies to enhance medical concept representation. Our method jointly represents medical concepts across multiple ontology structures by performing message passing in two dimensions: (1) vertical propagation over levels of the ontology hierarchy, and (2) horizontal propagation over co-occurring concepts in EHR visits. Additionally, OntoFAR leverages large language models (LLMs) pre-trained on massive open-world information to understand each target concept together with its ontology relationships, providing enhanced embedding initialization for concepts. Through extensive experimental studies on two public datasets, MIMIC-III and MIMIC-IV, we validate the superior performance of OntoFAR over state-of-the-art baselines. Beyond accuracy, our model can also serve as an add-on that boosts existing healthcare prediction models, and it demonstrates good robustness in scenarios with limited data availability. The implementation code is available at [https://anonymous.4open.science/r/OntoFAR-35D4](https://anonymous.4open.science/r/OntoFAR-35D4).
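The two propagation dimensions described in the abstract can be illustrated with a minimal sketch. It is not the OntoFAR implementation: the toy vocabulary, the parent map, the visits, and the function names are all hypothetical, plain mean aggregation stands in for whatever message-passing operator the paper uses, and random vectors stand in for the LLM-based embedding initialization.

```python
import numpy as np

# Hypothetical toy vocabulary: two ontologies (conditions, drugs) in one index space.
concepts = ["cond_root", "diabetes", "hypertension",
            "drug_root", "metformin", "lisinopril"]
idx = {c: i for i, c in enumerate(concepts)}

# Assumed child -> parent edges; each edge stays inside a single ontology.
parent = {"diabetes": "cond_root", "hypertension": "cond_root",
          "metformin": "drug_root", "lisinopril": "drug_root"}

# Toy EHR visits: codes that co-occur, possibly from different ontologies.
visits = [["diabetes", "metformin"],
          ["hypertension", "lisinopril"],
          ["diabetes", "metformin", "hypertension"]]


def row_normalize(a):
    """Scale each row to sum to 1 so propagation is a weighted mean."""
    s = a.sum(axis=1, keepdims=True)
    return np.divide(a, s, out=np.zeros_like(a), where=s > 0)


def vertical_adjacency():
    """Self-loops plus parent/child links within each ontology hierarchy."""
    a = np.eye(len(concepts))
    for child, par in parent.items():
        a[idx[child], idx[par]] = 1.0
        a[idx[par], idx[child]] = 1.0
    return row_normalize(a)


def horizontal_adjacency():
    """Self-loops plus visit co-occurrence counts, which can cross ontologies."""
    a = np.eye(len(concepts))
    for visit in visits:
        for x in visit:
            for y in visit:
                if x != y:
                    a[idx[x], idx[y]] += 1.0
    return row_normalize(a)


def propagate(h, rounds=1):
    """Alternate vertical (hierarchy) and horizontal (co-occurrence) averaging."""
    a_v, a_h = vertical_adjacency(), horizontal_adjacency()
    for _ in range(rounds):
        h = a_v @ h  # mix each concept with its ancestors and descendants
        h = a_h @ h  # mix each concept with codes seen in the same visits
    return h


rng = np.random.default_rng(0)
h0 = rng.normal(size=(len(concepts), 4))  # random stand-in for LLM-initialized embeddings
h1 = propagate(h0)
```

In this toy setup the vertical adjacency never links `diabetes` to `metformin`, because they sit in different ontologies, while the horizontal adjacency does, because they appear in the same visits. That cross-ontology link is the kind of relationship the abstract says isolated ontology models cannot exploit.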
Primary Area: other topics in machine learning (i.e., none of the above)
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 11758