KIF: A Framework for Virtual Integration of Heterogeneous Knowledge Bases using Wikidata

Published: 01 Jan 2024, Last Modified: 16 May 2025CoRR 2024EveryoneRevisionsBibTeXCC BY-SA 4.0
Abstract: We present a Wikidata-based framework, called KIF, for virtually integrating heterogeneous knowledge sources. KIF is written in Python and is released as open-source. It leverages Wikidata's data model and vocabulary plus user-defined mappings to construct a unified view of the underlying sources while keeping track of the context and provenance of their statements. The underlying sources can be triplestores, relational databases, CSV files, etc., which may or may not use the vocabulary and RDF encoding of Wikidata. The end result is a virtual knowledge base which behaves like an "extended Wikidata" and which can be queried using a simple but expressive pattern language, defined in terms of Wikidata's data model. In this paper, we present the design and implementation of KIF, discuss how we have used it to solve a real integration problem in the domain of chemistry (involving Wikidata, PubChem, and IBM CIRCA), and present experimental results on the performance and overhead of KIF
Loading

OpenReview is a long-term project to advance science through improved peer review with legal nonprofit status. We gratefully acknowledge the support of the OpenReview Sponsors. © 2025 OpenReview