Storage Balancing in P2P Based Distributed RDF Data Stores

Maximiliano Osorio, Carlos Buil-Aranda

Jul 29, 2017 (modified: Jul 29, 2017) ISWC 2017 DeSemWeb Submission readers: everyone
  • Abstract: Centralized RDF repositories have been designed to support RDF data storage and retrieval. However, they suffer from the traditional limitations of centralized approaches: limited scalability and fault tolerance. Peer-to-Peer (P2P) networks can provide the scalability, fault tolerance and robustness that existing Semantic Web applications need and that current local RDF storage solutions lack. A common strategy in state-of-the-art P2P RDF data stores is to store each triple at three locations, so that it can be found with a look-up by subject, predicate, or object identifier. One major issue with this strategy is the lack of load balancing, since occurrences of identifiers in triples are not uniformly distributed. Consequently, it leads to an unbalanced query-processing load and an unfair storage load across the network. To solve this load-imbalance problem, we propose a new scheme that splits the data held by stressed nodes. It evenly distributes the excess data across neighboring nodes and provides a Prefix Hash Table for fast access to that data. We provide an empirical evaluation of our approach and compare it with other state-of-the-art storage balancing systems, showing its feasibility.
  • TL;DR: solution to a storage balancing problem in RDF P2P systems
  • Submission category: Research Article
  • Keywords: RDF, P2P
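
The three-location indexing strategy described in the abstract, and the storage skew it causes, can be illustrated with a minimal sketch. This is not the authors' system: the node count, the `node_for` and `place_triple` helpers, the modulo placement (a stand-in for a real DHT look-up), and the sample triples are all assumptions made for illustration.

```python
import hashlib
from collections import Counter

NUM_NODES = 8  # hypothetical network size, chosen only for this example


def node_for(term: str, num_nodes: int = NUM_NODES) -> int:
    """Map an RDF term to a node id by hashing it (stand-in for DHT placement)."""
    digest = hashlib.sha1(term.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


def place_triple(triple, num_nodes: int = NUM_NODES):
    """Return the node ids indexing the triple by subject, predicate and object."""
    return [node_for(term, num_nodes) for term in triple]


def storage_load(triples, num_nodes: int = NUM_NODES) -> Counter:
    """Count how many triple copies each node stores."""
    load = Counter()
    for triple in triples:
        for node in place_triple(triple, num_nodes):
            load[node] += 1
    return load


# Made-up data: 100 distinct subjects sharing one predicate and one object.
triples = [(f"ex:person{i}", "rdf:type", "foaf:Person") for i in range(100)]
load = storage_load(triples)

for node in range(NUM_NODES):
    print(f"node {node}: {load[node]} copies")
```

Because every sample triple shares the same predicate and object, the nodes responsible for those two terms each receive a copy of all 100 triples, while the subject copies are spread over the remaining nodes. This is the kind of imbalance that the proposed splitting scheme targets.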