- Transport Network Failures and Their Impacts
- Survivability Principles from the Ground Up
- Physical Layer Survivability Measures
- Survivability at the Transmission System Layer
- Logical Layer Survivability Schemes
- Service Layer Survivability Schemes
- Comparative Advantages of Different Layers for Survivability
- Measures of Outage and Survivability Performance
- Measures of Network Survivability
- Restorability
- Reliability
- Availability
- Network Reliability
- Expected Loss of Traffic and of Connectivity
3.3 Physical Layer Survivability Measures
The physical layer, sometimes called Layer 0, is the infrastructure of physical resources on which the network is based: buildings, rights-of-way, cable ducts, cables, underground vaults, and so on. In this layer, survivability considerations are primarily aimed at physical protection of signal-bearing assets and ensuring that the physical layer topology has a basic spatial diversity so as to enable higher layer survivability techniques to function.
3.3.1 Physical Protection Methods
A number of standard practices enhance the physical protection of cables. In metropolitan areas, PVC tubing is generally used as a duct structure to give cables a fairly high degree of protection, albeit at high cost. Outside built-up areas, fiber cables are usually direct-buried (without PVC ducts) at a depth of 1.5 to 2 meters, and a brightly colored marker ribbon is buried a foot above the cable as a warning. The tape usually carries a message such as "Warning: Optical Cable—STOP". It is also standard practice to mark all subsurface cable routes with above-ground signs, but these can be difficult to maintain over the years. In some cases where the water table is high, buried cables have actually been found to have moved sideways by up to several meters from their marked surface positions. "Call before you dig" programs are often made mandatory by legislation to reduce dig-ups, and hand digging is required once excavation comes within two feet of the cable's expected depth. Locating cables from the surface has to be done promptly, and this is an area where geographical information systems can improve operators' on-line knowledge of where their buried structures (and other network assets) are. Cable locating is also facilitated by applying a cable-finding tone to the cable, assuming (as is usual) that a metallic strength member or copper pairs for supervisory and power-feeding purposes are present. Measures against rodents include climbing shields on poles and cable sheath materials designed to deter rodents from chewing.
On undersea cables, the greatest hazard is from ship anchors and fishing nets dragging across the continental shelf portions. Extremely heavily armored steel outer layers have been developed for use on these sections, as have methods for trenching the cable into the sea floor until it leaves the continental shelf. Beyond the continental shelf, cables are far less heavily armored and lie directly on the sea floor.
Interestingly, the main physical hazard to such deep-sea cables appears to be shark bites. Several transoceanic cables have been damaged by sharks, which seem to be attracted to the magnetic fields that power-feeding currents set up around the cable. Thus, even in the one case where it might seem that we need not plan for cable cuts, we still must.
Underground cables are either gel-filled to prevent ingress of water or, in the past, were air-pressurized from the cable vault. An advantage of cable pressurization is that with an intact cable sheath there will normally be no flow. Any loss of sheath integrity is then detected automatically when the pressurization system starts showing a significant flow rate. In addition to the main hazards to "aerial" cables of vehicles, tree falls and ice storms mentioned by Crawford, vandalism and gunshots are further physical hazards. A problem in some developing countries is that aerial fiber optic cables are mistaken for copper cables and pulled down by those who steal copper cable for its salvage or resale value. Overall, however, aerial cables sustain only about one third as many cable cuts as buried cables do from dig-ups. And (ironically), while buried cable routes are well marked on the surface, experience with aerial cables shows that it is better not to mark fiber optic cables in any visibly distinct way, so as to avoid deliberate vandalism.
3.3.2 Reducing Physical Protection Costs with Restoration Schemes
The cost of trenching to bury fiber cable can be quite significant and is highly dependent on the depth of burial required. An interesting prospect of using active protection or restoration schemes is that an operator may consider relaxing the burial depth (from 2 m to 1.5 m, say) relative to previous standards for point-to-point transmission systems. An (unpublished) study by the author found this to be quite a viable strategy for one regional carrier. The issue was that existing standards required a 2 meter burial depth for any new cable, and it would have been very expensive to trench to that depth all the way through a certain pass in the Rocky Mountains. But since these cables were destined to be part of either a restorable ring or mesh network, the question arose: "Do we really still have to bury them to two meters?" Indeed, availability calculations (of the general type in Section 3.12) showed that about a thousand-fold increase in the physical failure rate could be sustained (for the same overall system availability) if the fibers in such cables were part of active self-healing rings. Given that the actual increase in failure rate from relaxing the depth by half a meter was much less than a thousand-fold, the economic saving was possible without any net loss of service availability. Essentially the same trade-off is available with mesh-based restorable networking, and we suggest it as an area for further consideration whenever new cable is to be trenched in.
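The kind of trade-off described above can be illustrated with a deliberately simplified calculation. The sketch below is not the study's model: it assumes a single unprotected route compared against two identical, independent, physically disjoint routes (service is lost only when both are down), and it ignores switching time, node failures, and ring geometry. The function names and the example failure rate and repair time are hypothetical.

```python
HOURS_PER_YEAR = 8766.0  # average year length, used only for unit conversion


def unavailability(cuts_per_year: float, mttr_hours: float) -> float:
    """Steady-state unavailability MTTR / (MTBF + MTTR) of one cable route."""
    x = cuts_per_year * mttr_hours / HOURS_PER_YEAR  # ratio MTTR / MTBF
    return x / (1.0 + x)


def protected_unavailability(cuts_per_year: float, mttr_hours: float) -> float:
    """Assumed model: two identical independent disjoint routes in parallel.

    Service is unavailable only when both routes are down at once.
    """
    return unavailability(cuts_per_year, mttr_hours) ** 2


def sustainable_multiplier(cuts_per_year: float, mttr_hours: float) -> float:
    """Factor k by which the cut rate of each protected route may grow
    before the protected pair is no better than the original single route."""
    target = unavailability(cuts_per_year, mttr_hours)
    per_route = target ** 0.5                   # required per-route unavailability
    x_needed = per_route / (1.0 - per_route)    # invert u = x / (1 + x)
    x_now = cuts_per_year * mttr_hours / HOURS_PER_YEAR
    return x_needed / x_now


if __name__ == "__main__":
    # Illustrative inputs only: 0.01 cuts per year, 12 hour repair time.
    k = sustainable_multiplier(0.01, 12.0)
    print(f"sustainable increase in cut rate: about {k:.0f}-fold")
```

Under these assumptions the tolerable multiplier is roughly the reciprocal of the square root of the unprotected unavailability, so the more reliable the original buried route, the larger the increase in failure rate that protection can absorb.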
3.3.3 Physical Layer Topology Considerations
When a cable is severed, higher layers can only restore by rerouting the affected carrier signals over physically diverse surviving systems. Physically disjoint routes must therefore exist in Layer 0. This is a basic requirement for survivability that no other layer can provide or emulate. Before the widespread deployment of fiber, backbone transport was largely based on point-to-point analog and digital microwave radio, and the physical topology did not really need such diversity. Self-contained 1:N APS systems would combat fading from multipath propagation effects or single-channel equipment failures, but there was no real need for restoration in the sense of recovery from a complete failure of the system. The radio towers themselves were highly robust and one cannot easily "cut" the free space path between the towers. National scale networks consequently tended to have many singly connected nodes and roughly approximated a minimum length tree spanning all nodes. Fiber optics, however, being cable-based, forces us to "close" the physical topologies into more mesh-like structures in which no single cut can isolate any node from the rest. The evolution this implies is illustrated in Figure 3-4.
Figure 3-4. For survivability against failures that overcome physical protection measures, the physical layer graph must be either two-connected or biconnected.
Technically, the physical route structure must provide either two-connectedness or biconnectedness over the nodes. In a biconnected network, there are at least two fully node-disjoint paths between each node pair. Two-connectedness implies only that two span-disjoint paths exist between all node pairs; in some cases the two paths may have a node in common. Algorithmic tests for these properties are discussed in Chapter 4, although they are readily apparent to a human viewing a plan diagram of the network. Note that this topological evolution to a closed graph of some form has by itself nothing to do with the speed or type of restoration scheme to be employed. It is simply topologically essential to have at least two physically disjoint routes between every node pair for automatic restoration by diverse routing to even be an option.
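The two properties can be checked mechanically on a small graph. The following is a brute-force sketch, not necessarily the method of Chapter 4: it relies on the standard equivalence (Menger's theorem) between "two disjoint paths exist between every node pair" and "no single span cut (or node loss) disconnects the graph", and simply tries every single removal. All function names are illustrative.

```python
from collections import deque


def is_connected(nodes, spans):
    """Breadth-first search: True if every node is reachable from any one node."""
    nodes = set(nodes)
    if not nodes:
        return True
    adjacent = {n: set() for n in nodes}
    for a, b in spans:
        if a in nodes and b in nodes:  # ignore spans touching removed nodes
            adjacent[a].add(b)
            adjacent[b].add(a)
    start = next(iter(nodes))
    seen = {start}
    queue = deque([start])
    while queue:
        for neighbor in adjacent[queue.popleft()]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen == nodes


def is_two_connected(nodes, spans):
    """True if the graph survives the cut of any single span."""
    spans = list(spans)
    return is_connected(nodes, spans) and all(
        is_connected(nodes, spans[:i] + spans[i + 1:]) for i in range(len(spans))
    )


def is_biconnected(nodes, spans):
    """True if the graph survives any single span cut or single node loss."""
    nodes = set(nodes)
    return (
        len(nodes) >= 3
        and is_two_connected(nodes, spans)
        and all(is_connected(nodes - {n}, spans) for n in nodes)
    )


if __name__ == "__main__":
    ring = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]
    # Two triangles sharing node C: span-disjoint paths exist, but all pass through C.
    bow_tie = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("D", "E"), ("E", "C")]
    print(is_biconnected("ABCD", ring))
    print(is_two_connected("ABCDE", bow_tie), is_biconnected("ABCDE", bow_tie))
```

The bow-tie example shows the distinction drawn in the text: it is two-connected but not biconnected, because every pair of paths between the two triangles shares the common node. The exhaustive removal test is adequate for plan-sized networks; linear-time algorithms exist for larger graphs.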
In practice, however, the acquisition of rights-of-way to enhance physical layer diversity can be very costly. Whereas a spanning tree requires as few as N-1 spans to cover N nodes, and can do so efficiently in terms of total distance, a biconnected graph requires at least N spans (which makes a single but long physical ring) and more typically up to 1.5 N for a reasonably well-connected and distance-minimized topology to support either ring- or mesh-based survivable transport networking. Thus, depending on the legacy network or starting point for evolution to mesh-based operation, a major expense may be incurred in the physical layer to ensure that higher layer survivability schemes can operate. Optimization and evolution of the physical layer topology is consequently one of the fundamental problems faced by modern network operators. This problem is treated further in Chapter 9.
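The span counts quoted above can be restated as average nodal degree, since each span terminates on two nodes and the average degree is therefore 2S/N. The short sketch below tabulates the three cases for a hypothetical network of 20 nodes; the node count and the function names are assumptions made only for illustration.

```python
def average_nodal_degree(num_nodes: int, num_spans: float) -> float:
    """Each span has two ends, so the average degree is 2S / N."""
    return 2.0 * num_spans / num_nodes


def span_budgets(num_nodes: int) -> dict:
    """Span counts for the three topologies mentioned in the text."""
    return {
        "spanning tree (N - 1)": num_nodes - 1,
        "single ring (N)": num_nodes,
        "well-connected mesh (1.5 N)": 1.5 * num_nodes,
    }


if __name__ == "__main__":
    n = 20  # illustrative network size
    for name, spans in span_budgets(n).items():
        print(f"{name}: {spans:g} spans, average degree {average_nodal_degree(n, spans):.2f}")
```

For 20 nodes this gives 19, 20, and 30 spans, or average degrees of 1.9, 2, and 3 respectively, which makes clear that closing a tree into a single ring adds only one span in count, while the distance and rights-of-way involved are what drive the cost.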
3.3.4 The Problem of Diversity Integrity
Related to the creation of physical layer diversity is the need to validate the details of the physical structures that underlie logical protection or restoration routes, so as to ensure the integrity of the mapping from physical to logical diversity. For instance, how does one know that the opposite spans of a SONET ring correspond to cables that are on different sides of the street? Maybe they are on one street but two blocks later they share the same duct or bridge-crossing. This is the issue of shared risk link groups (SRLGs) mentioned in Chapter 1. It is one thing to recognize that we will have to take SRLGs into account, but the further point here is that simply knowing with certainty how each logical path maps onto physical structures (which is what defines the SRLGs) is itself a difficult and important problem in the physical network. The general issue is one of being able to correlate logical level service or system implementations to the ultimate underlying physical structures. This is a significant administrative challenge because cables and ducts may have been pulled in at different times over several decades. The end points may be known, but correlating these to different outside plant conduits, pole structures, and so on is the problem. Many telcos are investing heavily in geographic information systems and conducting ground-truth audits to keep tabs on all these physical plant details. Without assured physical diversity (or at least knowledge of the SRLGs on a given path pair), attempts to provide redundancy to enable active protection or restoration are easily defeated. More about the problem of ensuring physical diversity follows after our review of protection options at all layers.
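Once an inventory of this kind exists, the diversity check itself is simple. The sketch below assumes a hypothetical record that lists, for each logical span, the physical structures (ducts, bridge crossings, and so on) it passes through; the span names, structure names, and function names are all invented for illustration. Two paths that are meant to be diverse are then compared for any structure they have in common.

```python
from typing import Dict, Iterable, Set


def structures_on_path(path: Iterable[str], structures_of: Dict[str, Set[str]]) -> Set[str]:
    """Union of all physical structures traversed by the spans of one path."""
    used: Set[str] = set()
    for span in path:
        used |= structures_of[span]
    return used


def shared_structures(
    path_a: Iterable[str], path_b: Iterable[str], structures_of: Dict[str, Set[str]]
) -> Set[str]:
    """Physical structures common to both paths; empty if they are truly diverse."""
    return structures_on_path(path_a, structures_of) & structures_on_path(path_b, structures_of)


# Hypothetical inventory: logical span -> physical structures it occupies.
inventory = {
    "ring-east-1": {"duct-main-st-north", "bridge-river-1"},
    "ring-east-2": {"duct-5th-ave"},
    "ring-west-1": {"duct-main-st-south"},
    "ring-west-2": {"bridge-river-1", "duct-oak-st"},
}
east_side = ["ring-east-1", "ring-east-2"]
west_side = ["ring-west-1", "ring-west-2"]

if __name__ == "__main__":
    print(shared_structures(east_side, west_side, inventory))
```

In this made-up example the two sides of the ring use opposite sides of one street but later cross the same bridge, so the check reports the bridge as a shared risk. The hard part in practice, as the text notes, is populating and maintaining the inventory, not running the comparison.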
The "Red and White" Network
One interesting proposal to address this physical diversity assurance problem, and provide a very simple and universal strategy for survivability, is the concept of a "red and white" network.4 The suggestion, not made frivolously, is to purchase and install every physical item of a network in duplicate, classify one of each pair as "red" and the other as "white," and literally paint every item accordingly. Only one rule then ever need be applied network-wide: always keep red and white apart, whether cables, power supplies, equipment bays, etc. When followed through, the result would be an entirely dual-plane network with complete physical disjointness between planes. Every application warranting protected service would then be realized once in the red plane and again in the white plane, network-wide. The result is assured physical diversity and the operations and planning simplicity of 1+1 tail-end selection as the network-wide survivability principle, for a 100% investment in redundancy in both node and span equipment. Lest it seem that the idea of completely duplicating the network is unrealistic, it should be pointed out that ring-based networks often embody 200 to 300% capacity redundancy, and although the nodal equipment elements are not completely duplicated in ring-based transport, it is normal to have 1+1 local standby redundancy on high speed circuit packs, processors and power supplies built into these network elements. In contrast, each plane of the "red and white" network would use fully non-redundant individual equipment items. Importantly, however, we will see that mesh-based networking can achieve survivability with much less than 100% capacity redundancy and can also provide differentiated levels of service protection.