2.9 Failure Cases
So far we have been discussing different scenarios to illustrate how VRRP's switchover mechanism kicks in under different configurations. It may also be informative to look at different failure points in our context and indicate in which cases VRRP is of help and in which cases tools other than VRRP are required. To list all possible failure points, we start with our R1 router that acts as the master of the V1 virtual router. Figure 2-10 depicts the points of failure on R1.
FIGURE 2-10. Points of failure around a router
f0 indicates the failure of R1 as a whole. Different causes may lead to such a failure: a crash of the operating system, a critical hardware failure, or somebody accidentally unplugging the device. Whatever the cause, with a failure in the f0 category, R1 becomes completely inoperative. f1 corresponds to the failure of the IF(R1.1) interface connecting R1 to the N1 network. f2 corresponds to the failure of the IF(R1.2) interface connecting R1 to the N2 network.
By looking at the larger context of our branch office LAN, we observe three additional failure points. Figure 2-11 establishes such a context in our discussion with VRRP and RIP running in the local network (N1) and BGP4 covering the cloud.
FIGURE 2-11. Failure cases in context
Cloud failure in Figure 2-11 represents a complete failure of the WAN. In such cases, the branch office LAN segment becomes completely isolated. Given the current configuration, neither VRRP nor a dynamic routing protocol would be of help to restore the service. For that reason, we don't include the cloud failure case in our discussion.
f3 represents the failure of N1 as a whole, with both R1 and R2 staying operational. Finally, f4 represents the case where IF(R1.2) is operational but a failure occurs in the cloud, making R1 unreachable while keeping R2 connected.
To be more comprehensive, we distinguish the direction of the potential traffic flows. We use the symbol → to represent the traffic flowing from the H1 host toward the N2 network; we use the symbol ← to represent the traffic in the opposite direction, flowing from N2 toward H1. In order to contrast the cases where VRRP would be of help with the cases where a dynamic routing protocol would be required, we assume that RIP runs on N1 and BGP4 runs across the cloud between R1, R2, and R3. Note that by using multiple service providers, one creates redundant clouds, provided the providers do not share resources. This is, of course, an additional (service- and organization-oriented) dimension of availability.
Given five failure points and two traffic flow directions, we have ten cases to study in all. We expect some of them to be equivalent from the perspective of our discussion: the relevance of VRRP to the specific case.
The study of Table 2-3 makes clear that VRRP is designed to protect a local interface for outgoing traffic. The switchover kicks in when IF(R1.1) becomes inoperative through f0 or f1. The f3 case is a local catastrophe. Neither VRRP nor dynamic routing protocols are of help in this case. This predicament reminds us that there are other single points of failure in the system that cannot be fixed by relying on protocols alone. In cases where IF(R1.1) is not impacted, VRRP is not relevant. To preserve availability in such cases, dynamic routing protocols are handy. Two observations may be appropriate at this juncture. First, dynamic routing protocols are also designed to preserve network availability. This is actually the topic of our discussion in Chapter 1 under the title Availability at Layer 2 and Layer 3. Second, VRRP is not a routing protocol. It is a first-hop router redundancy and failover protocol, designed for cases in which hosts on a LAN do not run dynamic routing protocols.
TABLE 2-3. Possible Failure Cases in a Context
| FAILURE | DIRECTION | EFFECTS |
|---|---|---|
| f0 | H1 → N2 | The master is inoperative. R2 becomes the new master and starts forwarding H1 traffic toward N2. VRRP helps in this case. |
| f0 | N2 → H1 | The master is inoperative. R2 becomes the new master, but VRRP is not of help in this case. R3 detects the failure of R1 through BGP4 and reroutes the traffic to R2. |
| f1 | H1 → N2 | The backup in R2 recognizes the unavailability of the master and performs the switchover. VRRP helps in this case. |
| f1 | N2 → H1 | Through BGP4, R3 detects that N1 is unreachable via R1 and reroutes the traffic to R2. |
| f2 | H1 → N2 | The master stays operational and no switchover takes place. Detecting the failure of IF(R1.2), R1 reroutes the traffic to R2 through RIP. |
| f2 | N2 → H1 | The master stays operational, but R3 detects the failure of R1 through BGP4 and reroutes the traffic to R2. The effect is the same as in f0. |
| f3 | H1 → N2 | The N1 network is fully isolated. Neither VRRP nor RIP can be of help in this case. |
| f3 | N2 → H1 | The N1 network is fully isolated. Neither VRRP nor BGP4 can be of help in this case. |
| f4 | H1 → N2 | R1 detects the link failure. RIP reroutes the traffic to R2. |
| f4 | N2 → H1 | R3 detects the failure and reroutes the traffic to R2 through BGP4. |
These rerouting cases remind us that dynamic routing protocols contribute to the preservation of network availability.
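
The ten cases can also be written down as a small lookup, which makes the division of labor between VRRP and the routing protocols easy to check. The following Python sketch is only an illustration. It assumes the failure-point definitions given with Figures 2-10 and 2-11 (f1 is the failure of IF(R1.1), f2 the failure of IF(R1.2), f3 the failure of N1 as a whole, f4 a cloud failure that cuts off R1 only), and it takes the split between RIP and BGP4 from the table entries. The names `FAILURES`, `OUTBOUND`, `INBOUND`, and `recovery_mechanism` are hypothetical and exist only for this example.

```python
from itertools import product

# Failure points as defined with Figures 2-10 and 2-11 (descriptions paraphrased).
FAILURES = {
    "f0": "R1 fails as a whole",
    "f1": "IF(R1.1), the interface from R1 to N1, fails",
    "f2": "IF(R1.2), the interface from R1 to N2, fails",
    "f3": "N1 fails as a whole",
    "f4": "a cloud failure makes R1 unreachable; R2 stays connected",
}

# Hypothetical labels for the two traffic directions.
OUTBOUND = "H1->N2"
INBOUND = "N2->H1"
DIRECTIONS = (OUTBOUND, INBOUND)


def recovery_mechanism(failure, direction):
    """Return the protocol that restores service, or None if none can."""
    if failure not in FAILURES:
        raise ValueError(f"unknown failure point: {failure}")
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {direction}")

    # N1 itself is down: no protocol can restore service.
    if failure == "f3":
        return None

    # Traffic coming from the cloud is rerouted by R3 through BGP4.
    if direction == INBOUND:
        return "BGP4"

    # Outgoing traffic: VRRP helps only when IF(R1.1) is lost (f0 or f1).
    if failure in ("f0", "f1"):
        return "VRRP"

    # IF(R1.1) is still up, so the master stays in place and RIP reroutes.
    return "RIP"


# Enumerate all ten cases: five failure points times two directions.
for failure, direction in product(FAILURES, DIRECTIONS):
    mechanism = recovery_mechanism(failure, direction) or "none"
    print(f"{failure} {direction}: {mechanism}")
```

In this sketch VRRP appears in only two of the ten cases, both for outgoing traffic, which matches the observation that VRRP protects the first hop seen by the hosts and leaves the remaining cases to the dynamic routing protocols.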