Visualizing Wikidata: the RAWGraphs 2.0 approach
Confirmation: I have read and agree with the workshop's policy on behalf of myself and my co-authors.
Keywords: Wikidata, SPARQL, Data Visualization, RAWGraphs 2.0, visual model
Abstract: Wikidata has transformed open knowledge curation, yet its complexity and structure often limit its visibility and use, particularly for data visualization. Turning the output of Wikidata SPARQL queries into meaningful, understandable graphics is challenging because of the data preparation involved and the need for suitable visualization models.
In this proposal, we introduce RAWGraphs 2.0, a tool designed to bridge the gap between structured data and visual representations. Building on the seminal work by Jacques Bertin in 1968 and subsequent research, RAWGraphs 2.0 leverages a constructive, template-based approach. It relies on the concept of a “visualization template”: each template (e.g., bar chart, pie chart, Sankey diagram) identifies the visual variables available (such as color, size, shape, order) and the types of data dimensions that can be used to control them. This approach encourages a modular, extensible framework for visualization, currently encompassing 22 templates, and is readily adaptable to incorporate new ones.
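As a rough illustration of the "visualization template" idea described above, the following Python sketch models a template as a named set of visual variables, each declaring the data types it accepts. It is not the RAWGraphs 2.0 API: the class names (`Dimension`, `Template`), the `validate` method, the type labels, and the bar chart definition are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dimension:
    """One visual variable of a template (hypothetical model)."""
    name: str                # e.g. "size" or "color"
    accepts: frozenset       # data types this variable can be driven by
    required: bool = False


@dataclass(frozen=True)
class Template:
    """A chart type described by the visual variables it exposes."""
    name: str
    dimensions: tuple

    def validate(self, mapping, column_types):
        """Return a list of problems; an empty list means the mapping fits."""
        problems = []
        known = {d.name for d in self.dimensions}
        for dim_name in mapping:
            if dim_name not in known:
                problems.append(f"unknown dimension: {dim_name}")
        for dim in self.dimensions:
            column = mapping.get(dim.name)
            if column is None:
                if dim.required:
                    problems.append(f"missing required dimension: {dim.name}")
                continue
            col_type = column_types.get(column)
            if col_type not in dim.accepts:
                problems.append(
                    f"{dim.name}: column {column!r} has type {col_type!r}, "
                    f"expected one of {sorted(dim.accepts)}"
                )
        return problems


# Assumed definition of a bar chart, for illustration only.
BAR_CHART = Template(
    name="bar chart",
    dimensions=(
        Dimension("bars", frozenset({"string", "number", "date"}), required=True),
        Dimension("size", frozenset({"number"}), required=True),
        Dimension("color", frozenset({"string", "number", "date"})),
    ),
)

# Column types as they might be inferred from a query result.
column_types = {"countryLabel": "string", "population": "number"}

ok = BAR_CHART.validate({"bars": "countryLabel", "size": "population"}, column_types)
bad = BAR_CHART.validate({"bars": "countryLabel", "size": "countryLabel"}, column_types)
```

Under these assumptions, `ok` is empty because both required variables are bound to columns of an accepted type, while `bad` reports that a string column cannot drive `size`. Adding a new template in this model means declaring one more `Template` value, which mirrors the extensibility the proposal describes.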
We will discuss how RAWGraphs 2.0’s open and customizable architecture integrates with Wikidata SPARQL queries. This integration supports a consistent, transparent pipeline that preserves both data provenance and a record of the transformation process. The result is an environment where visualizations remain openly editable, fostering collaborative refinement and adaptation to evolving data or research interests.
This proposal first contextualizes the current state of data visualization on Wikipedia and related Wikimedia platforms. We examine both template-based and Wikimedia Commons–hosted visualizations, comparing their strengths and limitations, especially concerning traceability, data provenance, and adaptability.
We then introduce the RAWGraphs 2.0 conceptual framework and detail how SPARQL outputs from Wikidata can be prepared to fit this paradigm. We demonstrate how to align data queries with visualization templates, enabling researchers, developers, and Wikimedia contributors to quickly prototype and iterate on a broad range of graphical forms. From there, we outline common challenges encountered when working directly with the results of Wikidata queries, and propose good practices for composing such queries.
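One common preparation step can be sketched as follows, assuming the query result is available in the standard SPARQL 1.1 Query Results JSON Format (a `head.vars` list plus `results.bindings`). The function name `bindings_to_csv` and the sample data are illustrative placeholders, not part of RAWGraphs 2.0 or of any Wikidata tooling.

```python
import csv
import io


def bindings_to_csv(result):
    """Flatten a SPARQL 1.1 JSON result into CSV text, one row per binding."""
    variables = result["head"]["vars"]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(variables)
    for binding in result["results"]["bindings"]:
        # A variable left unbound (e.g. by OPTIONAL) is absent from the
        # binding, so it is written as an empty cell to keep columns aligned.
        writer.writerow([binding.get(v, {}).get("value", "") for v in variables])
    return buffer.getvalue()


# Made-up sample in the standard result format; values are placeholders.
SAMPLE = {
    "head": {"vars": ["countryLabel", "population"]},
    "results": {
        "bindings": [
            {
                "countryLabel": {"type": "literal", "xml:lang": "en", "value": "Exampleland"},
                "population": {
                    "type": "literal",
                    "datatype": "http://www.w3.org/2001/XMLSchema#decimal",
                    "value": "1000",
                },
            },
            # Second row has no population, as an OPTIONAL clause might yield.
            {"countryLabel": {"type": "literal", "xml:lang": "en", "value": "Sampleland"}},
        ]
    },
}

csv_text = bindings_to_csv(SAMPLE)
print(csv_text)
```

The sketch keeps the column order of the query's `SELECT` clause and makes missing values explicit, two of the details that tend to matter when a flat table is later mapped onto a template's visual variables.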
Finally, we present a series of “visualization template families” as practical exemplars of the potential for Wikidata-driven representations. These include templates for time series, correlation matrices, hierarchical layouts, proportional comparisons, networks, and distribution plots. By mapping Wikidata results into these forms, we reveal the versatility of RAWGraphs 2.0 in handling diverse data scenarios and research questions.
Through the integration of Wikidata, SPARQL, and RAWGraphs 2.0, this proposal emphasizes the potential of a flexible, open, and transparent visualization ecosystem. We aim to inspire the Wikidata community and researchers to adopt more intuitive, reproducible, updatable, and visually engaging methods of exploring and understanding open knowledge data.
Format: Paper (20 minutes presentation)
Submission Number: 31