From Federal Heterogeneity to Reproducible Analytics: A Provenance-Aware Knowledge Graph for Cross-Portal Comparability of German Open Government Data

Published: 09 Apr 2026, Last Modified: 16 Apr 2026, Venue: KGCW 2026, Readers: Everyone, License: CC BY 4.0
Keywords: Knowledge Graph Construction, Open Government Data, Dataset categorization, Provenance, Metadata, RML
TL;DR: A provenance-aware and temporally scoped knowledge graph for German Open Government Data enables reproducible cross-state analyses by linking evolving category schemes and organizational structures to persistent identifiers and mapping provenance.
Abstract: Cross-portal analyses of open government data in Germany are hindered by federal heterogeneity. Each of Germany's 16 federal states operates its own publication setup, organizational structure, and topic taxonomy, and these can shift over time due to administrative reforms and political terms. Consequently, questions that span states and topics, such as which ministries published which types of datasets under which category labels in which period, remain difficult to answer reproducibly. This is particularly evident when political stakeholders compare their own ministry or subject area against analogous ministries in other federal states. This position and vision paper proposes a knowledge graph that (i) captures portal- and state-specific category schemes as versioned resources, (ii) links publishers to persistent authority identifiers, and (iii) records extraction and mapping provenance for auditability. Rather than proposing a new foundational ontology or mapping language, the contribution is a reusable construction pattern for provenance-aware and temporally scoped knowledge graph construction in federated open data settings. The approach is aligned with the Knowledge Graph Construction Workshop's focus on mapping-based knowledge graph construction, workflows, and provenance-aware pipelines.
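Illustrative sketch (not taken from the paper): the three ingredients named in the abstract, namely versioned category schemes, persistent publisher identifiers, and extraction and mapping provenance, can be pictured with a few plain Python records. All class names, field names, identifiers, and sample values below are hypothetical placeholders, and the paper's actual pattern targets an RDF knowledge graph built with mapping languages such as RML rather than in-memory objects.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Provenance:
    source_portal: str    # hypothetical portal name
    mapping_version: str  # hypothetical identifier of the mapping that was run
    extracted_on: date


@dataclass(frozen=True)
class CategoryLabel:
    scheme_version: str       # the category scheme is a versioned resource
    label: str
    valid_from: date
    valid_to: Optional[date]  # None means the scheme version is still in use


@dataclass(frozen=True)
class DatasetRecord:
    title: str
    publisher_id: str         # persistent authority identifier (placeholder)
    category: CategoryLabel
    issued: date
    provenance: Provenance


def categories_by_publisher(records, start, end):
    """Group (scheme version, label) pairs by publisher for datasets issued in [start, end]."""
    result = {}
    for record in records:
        if start <= record.issued <= end:
            key = (record.category.scheme_version, record.category.label)
            result.setdefault(record.publisher_id, set()).add(key)
    return result


# Invented sample data: one publisher whose category scheme changed in 2023.
prov = Provenance("example-portal", "mapping-v1", date(2024, 1, 15))
old_label = CategoryLabel("scheme-v1", "Environment", date(2018, 1, 1), date(2022, 12, 31))
new_label = CategoryLabel("scheme-v2", "Environment and Climate", date(2023, 1, 1), None)

records = [
    DatasetRecord("Air quality 2022", "authority:0001", old_label, date(2022, 6, 1), prov),
    DatasetRecord("Air quality 2023", "authority:0001", new_label, date(2023, 6, 1), prov),
]

summary_2023 = categories_by_publisher(records, date(2023, 1, 1), date(2023, 12, 31))
```

Because the publisher key is a persistent identifier rather than a ministry name, the same query keeps working when a ministry is renamed, and the attached provenance record makes each answer traceable to a portal and mapping version.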
Submission Number: 4