Keywords: Cluster-sending, probabilistic protocols, Byzantine clusters, message sending
TL;DR: Solution to the cluster-sending problem (communication between fault-tolerant clusters) that has expected-constant cost, independent of the number of participants in the involved clusters.
Abstract: Traditional resilient systems operate on fully-replicated fault-tolerant clusters, which limits their scalability and performance. One way to make the step towards resilient high-performance systems that can deal with huge workloads is by enabling independent fault-tolerant clusters to efficiently communicate and cooperate with each other, as this also enables the usage of high-performance techniques such as sharding. Recently, such inter-cluster communication was formalized as the Byzantine cluster-sending problem. Unfortunately, existing worst-case optimal protocols for cluster-sending all have linear complexity in the size of the clusters involved. In this paper, we propose probabilistic cluster-sending techniques as a solution for the cluster-sending problem with only an expected constant message complexity, this independent of the size of the clusters involved and this even in the presence of highly unreliable communication. Depending on the robustness of the clusters involved, our techniques require only two-to-four message round-trips (without communication failures). Furthermore, our protocols can support worst-case linear communication between clusters. Finally, we have put our techniques to the test in an in-depth experimental evaluation that further underlines the exceptional low expected costs of our techniques in comparison with other protocols. As such, our work provides a strong foundation for the further development of resilient high-performance systems.
Area: Distributed Consensus
Previous Version: https://openreview.net/forum?id=8xwPz-Vx8sq¬eId=Fkm3pdOaT0t