Dealing with the Leader Failing
That’s what happens most of the time—when all goes well. But the point of getting a distributed system to work is what happens when things don’t go well. Here’s a different case. Neptune receives B1 and sends out its replication messages. But it is unable to contact Saturn. It could replicate only to Jupiter. At this point it loses all connectivity with the other two nodes. This leaves Jupiter and Saturn connected together, but disconnected from their leader (Figure 2.9).
Figure 2.9 Leader failure
So now what do these nodes do? For a start, how do they even find out what’s broken? Neptune can’t send Jupiter and Saturn a message saying the connection is broken . . . because the connection is broken. Nodes need a way to find out when connections to their colleagues break. They do this with a HeartBeat—or, more strictly, with the absence of a heartbeat.
A heartbeat is a regular message sent between nodes, just to indicate they are alive and communicating. Heartbeat does not necessarily require a distinct message type. When cluster nodes are already engaged in communication, such as when replicating data, the existing messages can serve the purpose of heartbeats. If Saturn doesn’t receive a heartbeat from Neptune for a period of time, Saturn marks Neptune as down. Since Neptune is the leader, Saturn now calls for an election for a new leader (Figure 2.10).
Figure 2.10 Leader sending heartbeats
The heartbeat gives us a way to know that Neptune has disconnected, so now we can turn to the problem of how to deal with Bob’s request. We need to ensure that once Neptune has confirmed the update to Bob, even if Neptune crashes, the followers can elect a new leader with B1 applied to their data. But we also need to deal with more complication than that, as Neptune may have received multiple messages. Consider the case where there are messages from both Alice (A1) and Bob (B1) handled by Neptune. Neptune successfully replicates them both with Jupiter but is unable to contact Saturn before it crashes, as shown in Figure 2.11.
Figure 2.11 Leader failure—incomplete replication
In this case, how do Jupiter and Saturn deal with the fact that they have different states?
The answer is essentially the same as discussed earlier for resilience on a single node. If Neptune writes changes into a Write-Ahead Log and treats replication as copying those log entries to its followers, then its followers will be able to figure out what the correct state is by examining the log entries (Figure 2.12).
Figure 2.12 Leader failure—incomplete replication—using log
When Jupiter and Saturn elect a new leader, they can tell that Jupiter’s log has later index entries, and Saturn can apply those log entries to itself to gain a consistent state with Jupiter.
This is also why Neptune can reply to Bob that the update was accepted, even though it hadn’t heard back from Saturn. As long as a Majority Quorum—that is, a majority—of the nodes in the cluster have successfully replicated the log messages, Neptune can be sure that the cluster will maintain consistency even if the leader disconnects.