Randomized Differential Testing of RDF Stores

Rui Yang, Yingying Zheng, Lei Tang, Wensheng Dou, Wei Wang, Jun Wei

Published: 2023, Last Modified: 13 Nov 2024. Venue: ICSE Companion 2023. License: CC BY-SA 4.0

Abstract: As a special kind of graph database system, RDF stores have been widely used in many applications, e.g., knowledge graphs and the semantic web. RDF stores utilize SPARQL as their standardized query language to store and retrieve RDF graphs. Incorrect implementations of RDF stores can introduce logic bugs that cause RDF stores to return incorrect query results. These logic bugs can lead to severe consequences and are likely to go unnoticed by developers. However, no available tools can detect logic bugs in RDF stores. In this paper, we propose RD<sup>2</sup>, a Randomized Differential testing approach for RDF stores, to reveal discrepancies among RDF stores, which indicate potential logic bugs in them. The core idea of RD<sup>2</sup> is to build an equivalent RDF graph in multiple RDF stores and verify whether they return the same query result for a given SPARQL query. Guided by the SPARQL syntax and the generated RDF graph, we automatically generate syntactically valid SPARQL queries that return non-empty query results with high probability. We further unify the formats of SPARQL query results from different RDF stores and find discrepancies among them. We evaluate RD<sup>2</sup> on three popular, widely used RDF stores. In total, we have detected 5 logic bugs in them. A video demonstration of RD<sup>2</sup> is available at https://youtu.be/da7XlsdbRR4.
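
The result-unification and comparison step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`normalize_term`, `normalize_result`, `find_discrepancies`) are hypothetical, and the input is assumed to be a list of solution rows shaped like the `bindings` array of the W3C SPARQL JSON results format. Each store's result is reduced to an order-insensitive multiset of rows, and any pair of stores whose multisets differ is reported as a discrepancy.

```python
from collections import Counter
from itertools import combinations


def normalize_term(term):
    """Map one RDF term to a canonical string (hypothetical helper)."""
    kind, value = term["type"], term["value"]
    if kind == "uri":
        return f"<{value}>"
    if kind == "bnode":
        # Blank-node labels are store-local, so this sketch ignores them.
        # Simplification: all blank nodes collapse to one placeholder.
        return "_:b"
    # Literal: keep the lexical form, datatype, and lower-cased language tag.
    datatype = term.get("datatype", "")
    lang = term.get("xml:lang", "").lower()
    return f'"{value}"^^{datatype}@{lang}'


def normalize_result(bindings):
    """Turn a list of solution rows into an order-insensitive multiset."""
    return Counter(
        frozenset((var, normalize_term(term)) for var, term in row.items())
        for row in bindings
    )


def find_discrepancies(results_by_store):
    """Return every pair of store names whose normalized results differ."""
    normalized = {
        name: normalize_result(rows) for name, rows in results_by_store.items()
    }
    return [
        (a, b)
        for a, b in combinations(sorted(normalized), 2)
        if normalized[a] != normalized[b]
    ]
```

Comparing multisets rather than lists means row order is ignored while duplicate rows still count, which suits queries without `ORDER BY`. A reported pair is only a candidate: it still needs manual inspection to decide which store, if any, has a logic bug.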