BoxCare: A Box Embedding Model for Disease Representation and Diagnosis Prediction in Healthcare Data

Published: 01 Jan 2024, Last Modified: 06 Feb 2025, WWW (Companion Volume) 2024, CC BY-SA 4.0
Abstract: Diagnosis prediction is becoming crucial for developing patient healthcare plans based on Electronic Health Records (EHRs). Existing works usually enhance diagnosis prediction by learning accurate disease representations, and many of them try to capture inclusive relations based on the hierarchical structures of existing disease ontologies, such as those provided by ICD-9 codes. However, they overlook exclusive relations, which reflect different and complementary perspectives of the ICD-9 structures, and thus fail to accurately represent relations among diseases and ICD-9 codes. To this end, we propose to project disease embeddings and ICD-9 code embeddings into boxes, where a box is an axis-aligned hyperrectangle occupying a geometric region, so that two boxes can clearly "include" or "exclude" each other. Building on the box embeddings, we further obtain patient embeddings by aggregating the disease representations for diagnosis prediction. Extensive experiments on two real-world EHR datasets show significant performance gains from our proposed framework, with average improvements of 6.04% in diagnosis prediction over state-of-the-art competitors.
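To make the "include" and "exclude" relations between boxes concrete, the following is a minimal Python sketch of axis-aligned boxes with hard containment and disjointness checks. It is an illustration only, not the BoxCare model: the names `Box`, `contains`, `excludes`, and `overlap_volume` are hypothetical, the coordinates are hand-picked rather than learned, and the abstract does not specify how the actual embeddings or training objectives are defined.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Axis-aligned hyperrectangle given by its lower and upper corners (hypothetical helper)."""
    lower: Tuple[float, ...]
    upper: Tuple[float, ...]

    def __post_init__(self) -> None:
        if len(self.lower) != len(self.upper):
            raise ValueError("lower and upper corners must have the same dimension")
        if any(lo > hi for lo, hi in zip(self.lower, self.upper)):
            raise ValueError("each lower coordinate must not exceed the upper one")

    def contains(self, other: "Box") -> bool:
        """True if `other` lies inside this box along every axis (an inclusive relation)."""
        return all(
            a_lo <= b_lo and b_hi <= a_hi
            for a_lo, a_hi, b_lo, b_hi in zip(self.lower, self.upper, other.lower, other.upper)
        )

    def excludes(self, other: "Box") -> bool:
        """True if the boxes are separated along at least one axis (an exclusive relation)."""
        return any(
            a_hi < b_lo or b_hi < a_lo
            for a_lo, a_hi, b_lo, b_hi in zip(self.lower, self.upper, other.lower, other.upper)
        )


def overlap_volume(a: Box, b: Box) -> float:
    """Volume of the intersection of two boxes; 0.0 when they do not overlap."""
    volume = 1.0
    for a_lo, a_hi, b_lo, b_hi in zip(a.lower, a.upper, b.lower, b.upper):
        side = min(a_hi, b_hi) - max(a_lo, b_lo)
        if side <= 0.0:
            return 0.0
        volume *= side
    return volume


# Hand-picked 2-D boxes: a parent category and two sibling codes (illustrative values only).
parent = Box((0.0, 0.0), (4.0, 4.0))
child = Box((1.0, 1.0), (2.0, 2.0))
sibling = Box((3.0, 1.0), (3.5, 2.0))

print(parent.contains(child))    # the parent box includes the child box
print(child.excludes(sibling))   # the two sibling boxes are disjoint
print(overlap_volume(parent, child))
```

In this toy setting a parent category in a hierarchy would contain its children, while sibling codes would occupy disjoint regions. A trainable model would typically replace these hard checks with smooth, differentiable measures, which is an assumption here and not something the abstract states.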
