PoweRGen: A power-law based generator of RDFS schemas

Published: 01 Jan 2012, Last Modified: 06 Dec 2024, Venue: Inf. Syst. 2012, License: CC BY-SA 4.0
Abstract: As the number of RDF datasets available on the Web has grown significantly in recent years, the scalability and performance of Semantic Web (SW) systems are gaining importance. Current RDF benchmarking efforts either consider schema-less RDF datasets or rely on fixed RDFS schemas. In this paper, we present the first RDFS schema generator, termed PoweRGen, which takes into account the features exhibited by real SW schemas. It considers the power-law functions involved in (a) the combined in- and out-degree distribution of the property graph (which captures the domains and ranges of the properties defined in a schema) and (b) the out-degree distribution of the transitive closure (TC) of the subsumption graph (which essentially captures the class hierarchy). The synthetic schemas generated by PoweRGen respect the power-law functions given as input with an accuracy ranging between 89% and 96%, and they also respect various morphological characteristics of the subsumption hierarchy, such as its depth and structure.
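To make measure (b) concrete, the following is a minimal sketch of how the out-degree distribution of the transitive closure of a subsumption graph could be computed. It is not PoweRGen's implementation. The toy class names, the `SUBCLASS_OF` mapping, and the helper functions are all hypothetical, and the sketch assumes edges point from a subclass to its superclass, so the TC out-degree of a class is its number of ancestors (reversing the edges would count descendants instead).

```python
from collections import Counter

# Hypothetical toy hierarchy (not from the paper): each class maps to its
# direct superclasses, i.e. its rdfs:subClassOf edges.
SUBCLASS_OF = {
    "Painter": ["Artist"],
    "Sculptor": ["Artist"],
    "Artist": ["Person"],
    "Person": ["Agent"],
    "Agent": [],
}


def ancestors(cls, graph):
    """Return every direct or indirect superclass of cls.

    These are exactly the successors of cls in the transitive closure.
    """
    seen = set()
    stack = list(graph.get(cls, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(graph.get(parent, []))
    return seen


def tc_out_degrees(graph):
    """Map each class to its out-degree in the transitive closure."""
    return {cls: len(ancestors(cls, graph)) for cls in graph}


def degree_distribution(degrees):
    """Count how many classes share each out-degree value."""
    return Counter(degrees.values())


degrees = tc_out_degrees(SUBCLASS_OF)
distribution = degree_distribution(degrees)
```

A generator in the spirit of the abstract would work in the opposite direction: it would start from a target distribution of this kind and build a hierarchy whose measured distribution approximates it.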