Keywords: knowledge graphs, clustering, medical knowledge
TL;DR: Open Drug Knowledge Graph proposes a knowledge-based method to harmonize four heterogeneous sources into a single, comprehensive drug-centric knowledge graph.
Abstract: Automatic knowledge-based systems can assist medical professionals in making more informed recommendations and decisions. Unfortunately, because no comprehensive knowledge base covering both medical and non-medical knowledge exists today, much manual effort is required to consolidate knowledge across sources that are heterogeneous in content and format. In this paper, we propose a knowledge-based method that aims to harmonize four such heterogeneous sources into a single drug-centric knowledge graph. The graph is based on the drugs found in Wikidata and is extended with specialized sources through an extraction and transformation pipeline that includes data acquisition, entity resolution, and semantic modeling. Our analyses show that the resulting graph and its embeddings can capture drug similarity through the drugs' associated symptoms, and can thus address common, knowledge-intensive medical search scenarios. As such, the graph holds promise for adaptation to drug recommendation in the future. Given the modular setup of our method, new sources can be included to accommodate healthcare object use cases relating to diagnoses and claims. We make the resulting knowledge source available in both relational database and property graph formats.
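As a rough illustration of the entity resolution step and of symptom-based drug similarity mentioned in the abstract, the following is a minimal sketch, not the pipeline used in the paper. It assumes entity resolution by normalized name matching and similarity by Jaccard overlap of symptom sets; the function names, the record layout, and the placeholder drug and symptom names are all hypothetical.

```python
def normalize(name: str) -> str:
    """Crude resolution key (an assumption): lowercase, alphanumerics only."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def build_graph(base_drugs, extra_records):
    """Attach symptoms from a second source to a base list of drugs.

    base_drugs: drug labels from the base source (hypothetical layout).
    extra_records: (drug_name, symptom) pairs from a specialized source.
    Records whose drug cannot be resolved to a base drug are skipped.
    """
    graph = {normalize(n): {"label": n, "symptoms": set()} for n in base_drugs}
    for drug_name, symptom in extra_records:
        key = normalize(drug_name)
        if key in graph:
            graph[key]["symptoms"].add(symptom)
    return graph


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Placeholder data, invented purely for illustration.
base = ["Drug-A", "Drug B"]
records = [
    ("drug a", "symptom_x"),
    ("DRUG A", "symptom_y"),
    ("drug b", "symptom_x"),
    ("Drug C", "symptom_z"),  # not in the base list, so it is skipped
]
graph = build_graph(base, records)
similarity = jaccard(graph["druga"]["symptoms"], graph["drugb"]["symptoms"])
```

Restricting the graph to drugs present in the base list mirrors the drug-centric design described above; a realistic resolver would more likely match on shared identifiers than on names alone.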