Abstract: State machine replication (SMR) uses consensus as its core component to reach agreement among a group of processes and thereby provide fault-tolerant services. Most SMR protocols, such as Paxos and Raft, are designed for the partial synchrony model. Partially synchronous protocols rely on timing assumptions to elect a process to a special role (such as the leader), and that process may become a performance bottleneck under a heavy workload. From an engineering perspective, such protocols have to wait for a predefined period of time and implement a (complicated) failover mechanism in order to replace a faulty leader. In contrast, asynchronous protocols are immune to these problems. This paper presents Bandle, a simple and highly efficient asynchronous SMR protocol. Instead of electing a special role, Bandle evenly assigns sequence numbers to the processes and proceeds in a leaderless manner. We further propose a binary agreement protocol, referred to as FlashBA, which decides whether a given proposal can be committed. FlashBA is inspired by Ben-Or's randomized algorithm but leverages a promise mechanism to achieve optimal latency (i.e., one message delay in the best case). An empirical study on the Amazon EC2 platform shows that Bandle delivers exceptional performance both when deployed within a single data center and when deployed across the globe.
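
To make the idea of evenly assigning sequence numbers concrete, the following is a minimal sketch and not Bandle's actual rule: the abstract does not state how sequence numbers are mapped to processes, so a simple round-robin mapping (sequence number `seq` belongs to process `seq mod n`) is assumed purely for illustration. The function names `owner_of` and `slots_for` are hypothetical.

```python
from typing import List


def owner_of(seq: int, n: int) -> int:
    """Return the id of the process assumed to own sequence number `seq`.

    Assumption (not taken from the paper): ownership is round-robin,
    i.e. seq mod n, over processes numbered 0 .. n-1.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if seq < 0:
        raise ValueError("seq must be non-negative")
    return seq % n


def slots_for(pid: int, n: int, count: int) -> List[int]:
    """Return the first `count` sequence numbers owned by process `pid`
    under the same assumed round-robin mapping."""
    if not 0 <= pid < n:
        raise ValueError("pid must be in range 0 .. n-1")
    return [pid + k * n for k in range(count)]


if __name__ == "__main__":
    n = 4  # hypothetical group size
    for pid in range(n):
        print(pid, slots_for(pid, n, 3))
```

Under this assumed mapping every process owns the same share of the sequence space, so no single process has to order all proposals. Whether a proposal in a given slot is committed would still be decided by the agreement protocol (FlashBA in the paper), which this sketch does not model.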