Apache ShardingSphere: A Holistic and Pluggable Platform for Data Sharding

2022 (modified: 19 Nov 2022) | ICDE 2022 | Readers: Everyone
Abstract: Traditional relational databases are now overwhelmed by increasing data volume and concurrent access. NoSQL databases can manage large-scale data, but most of them do not support complete transactions or standard SQL. NewSQL has been proposed to provide both high scalability and transactional properties with SQL support. One type of NewSQL builds distributed systems from scratch, which is too radical for some critical applications. The other type of NewSQL, i.e., data sharding among relational databases, is a better option for these scenarios. This paper presents Apache ShardingSphere, the first top-level open-source platform for data sharding in Apache, which enables developers to use sharded databases like a single database. Specifically, Apache ShardingSphere integrates six databases and designs and implements a complete SQL engine to route requests correctly and intelligently. Additionally, it encapsulates three types of distributed transactions and provides two adaptors for different scenarios. Moreover, it proposes a novel AutoTable strategy and a query language, i.e., DistSQL, allowing database maintainers to easily configure the sharded databases. Furthermore, it provides many other pluggable features to better shard data. Extensive experiments are conducted using two well-known benchmarking tools, showing that Apache ShardingSphere is more efficient than eight state-of-the-art systems in our settings. All experimental source code is publicly released. More than 170 companies are currently using Apache ShardingSphere.
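
To make the idea of routing requests across sharded databases concrete, the following is a minimal sketch of modulo-based routing on a sharding key. It is not Apache ShardingSphere's actual SQL engine or API: the layout (`ds_0`, `ds_1`, two tables per data source), the logical table `t_order`, and the helper names `route` and `group_by_shard` are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical layout (an assumption, not taken from the paper):
# two data sources, each holding two physical tables of one logical table.
DATA_SOURCES = ["ds_0", "ds_1"]
TABLES_PER_SOURCE = 2


def route(logical_table: str, sharding_key: int) -> tuple[str, str]:
    """Map a sharding-key value to (data source, physical table) by modulo."""
    total_shards = len(DATA_SOURCES) * TABLES_PER_SOURCE
    slot = sharding_key % total_shards
    # Consecutive slots are packed into the same data source.
    data_source = DATA_SOURCES[slot // TABLES_PER_SOURCE]
    return data_source, f"{logical_table}_{slot}"


def group_by_shard(logical_table: str, keys: list[int]) -> dict:
    """Group key values by target shard, so one sub-query is issued per shard."""
    groups = defaultdict(list)
    for key in keys:
        groups[route(logical_table, key)].append(key)
    return dict(groups)
```

Under these assumptions, a lookup on a single key touches exactly one physical table, while a query over several keys is split into one sub-query per shard that actually holds matching rows. The application still addresses only the logical table name.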