University of Cambridge > > Inference Group > Latency-optimal fault-tolerant replication

Latency-optimal fault-tolerant replication

Add to your list(s) Download to your calendar using vCal

If you have a question about this talk, please contact Phil Cowans.

In this talk, I will present some of my PhD research in distributed agreement protocols done at the Computer Laboratory, Cambridge.


To improve the fault-tolerance of a distributed system, a central server can be replaced with many identical replicas. Client requests are then broadcast to and executed by all these. To remain consistent, all replicas must execute client requests in the same order.

In a simple approach, clients send their requests to a distinguished replica, which then broadcasts them in some order to all others. This algorithm is fast; a client request reaches all replicas in only two communication steps. Unfortunately, it is not fault tolerant; a crash of the distinguished replica will block the entire system. Fault-tolerant solutions to this problem exist, but they all require at least three steps, even when no failures occur.

In this talk, I will present new algorithms that combine the resistance of fault-tolerant solutions with the efficiency of fault-intolerant ones. Two facts are exploited: (i) requests from different clients are usually independent, and (ii) networks normally deliver messages to all recipients in the same order. In the absence of failures, if one the above conditions holds, these algorithms require only two steps, and three otherwise.

Keywords: Atomic Broadcast, Generic Broadcast, optimistic techniques, replication, fault-tolerance, Consensus

This talk is part of the Inference Group series.

Tell a friend about this talk:

This talk is included in these lists:

Note that ex-directory lists are not shown.


© 2006-2023, University of Cambridge. Contact Us | Help and Documentation | Privacy and Publicity