Our shortest-paths algorithms are based on a simple operation known as relaxation. We start a shortest-paths algorithm knowing only the network's edges and weights. As we proceed, we gather information about the shortest paths that connect various pairs of vertices. Our algorithms all update this information incrementally, making new inferences about shortest paths from the knowledge gained so far. At each step, we test whether we can find a path that is shorter than some known path. The term "relaxation" is commonly used to describe this step, which relaxes constraints on the shortest path. We can think of a rubber band stretched tight on a path connecting two vertices: A successful relaxation operation allows us to relax the tension on the rubber band along a shorter path.
Our algorithms are based on applying repeatedly one of two types of relaxation operations:
-
Edge relaxation: Test whether traveling along a given edge gives a new shortest path to its destination vertex.
-
Path relaxation: Test whether traveling through a given vertex gives a new shortest path connecting two other given vertices.
Edge relaxation is a special case of path relaxation; we consider the operations separately, however, because we use them separately (the former in single-source algorithms; the latter in all-pairs algorithms). In both cases, the prime requirement that we impose on the data structures that we use to represent the current state of our knowledge about a network's shortest paths is that we can update them easily to reflect changes implied by a relaxation operation.
First, we consider edge relaxation, which is illustrated in Figure 21.5. All the single-source shortest-paths algorithms that we consider are based on this step: Does a given edge lead us to consider a shorter path to its destination from the source?
Figure 21.5. Edge relaxation
The data structures that we need to support this operation are straightforward. First, we have the basic requirement that we need to compute the shortest-paths lengths from the source to each of the other vertices. Our convention will be to store in a vertex-indexed array wt the lengths of the shortest known paths from the source to each vertex. Second, to record the paths themselves as we move from vertex to vertex, our convention will be the same as the one that we used for other graph-search algorithms that we examined in Chapters 18 through 20: We use a vertex-indexed array spt to record the last edge on a shortest path from the source to the indexed vertex. These edges constitute a tree.
With these data structures, implementing edge relaxation is a straightforward task. In our single-source shortest-paths code, we use the following code to relax along an edge e from v to w:
if (wt[w] > wt[v] + e.wt()) { wt[w] = wt[v] + e.wt(); spt[w] = e; }
This code fragment is both simple and descriptive; we include it in this form in our implementations, rather than defining relaxation as a higher-level abstract operation.
Definition 21.2 Given a network and a designated vertex s, a shortest-paths tree (SPT) for s is a subnetwork containing s and all the vertices reachable from s that forms a directed tree rooted at s such that every tree path is a shortest path in the network.
There may be multiple paths of the same length connecting a given pair of nodes, so SPTs are not necessarily unique. In general, as illustrated in Figure 21.2, if we take shortest paths from a vertex s to every vertex reachable from s in a network and from the subnetwork induced by the edges in the paths, we may get a DAG. Different shortest paths connecting pairs of nodes may each appear as a subpath in some longer path containing both nodes. Because of such effects, we generally are content to compute any SPT for a given digraph and start vertex.
Our algorithms generally initialize the entries in the wt array with a sentinel value. That value needs to be sufficiently small that the addition in the relaxation test does not cause overflow and sufficiently large that no simple path has a larger weight. For example, if edge weights are between 0 and 1, we can use the value V. Note that we have to take extra care to check our assumptions when using sentinels in networks that could have negative weights. For example, if both vertices have the sentinel value, the relaxation code just given takes no action if e.wt is nonnegative (which is probably what we intend in most implementations), but it will change wt[w] and spt[w] if the weight is negative.
Our code always uses the destination vertex as the index to save the SPT edges (spt[w].w() == w). For economy and consistency with Chapters 17 through 19, we use the notation st[w] to refer to the vertex spt[w].v() (in the text and particularly in the figures) to emphasize that the spt array is actually a parent-link representation of the shortest-paths tree, as illustrated in Figure 21.6. We can compute the shortest path from s to t by traveling up the tree from t to s; when we do so, we are traversing edges in the direction opposite from their direction in the network and are visiting the vertices on the path in reverse order (t, st[t], st[st[t]], and so forth).
Figure 21.6. Shortest-paths trees
One way to get the edges on the path in order from source to sink from an SPT is to use a stack. For example, the following code prints a path from the source to a given vertex w:
EdgeStack P = new EdgeStack(G.V()); Edge e = st[v]; while (e != null) { P.push(e); e = st[e.()]); } if (P.empty()) Out.print(P.top().v()); while (!P.empty()) { Out.print("-" + P.top().w()); P.pop(); }
In a class implementation, we could use code similar to this to provide clients with a array that contains the edges of the path.
If we simply want to print or otherwise process the edges on the path, going all the way through the path in reverse order to get to the first edge in this way may be undesirable. One approach to get around this difficulty is to work with the reverse network, as illustrated in Figure 21.6. We use reverse order and edge relaxation in single-source problems because the SPT gives a compact representation of the shortest paths from the source to all the other vertices, in an array with just V entries.
Next, we consider path relaxation, which is the basis of some of our all-pairs algorithms: Does going through a given vertex lead us to a shorter path that connects two other given vertices? For example, suppose that, for three vertices s, x, and t, we wish to know whether it is better to go from s to x and then from x to t or to go from s to t without going through x. For straight-line connections in a Euclidean space, the triangle inequality tells us that the route through x cannot be shorter than the direct route from s to t, but for paths in a network, it could be (see Figure 21.7). To determine which, we need to know the lengths of paths from s to x, x to t, and of those from s to t (that do not include x). Then, we simply test whether or not the sum of the first two is less than the third; if it is, we update our records accordingly.
Figure 21.7. Path relaxation
Path relaxation is appropriate for all-pairs solutions where we maintain the lengths of the shortest paths that we have encountered between all pairs of vertices. Specifically, in all-pairs shortest-paths code of this kind, we maintain an array of arrays d such that d[s][t] is the shortest-path length from s to t, and we also maintain an array of arrays p such that p[s][t] is the next vertex on a shortest path from s to t. We refer to the former as the distances matrix and the latter as the paths matrix. Figure 21.8 shows the two matrices for our example network. The distances matrix is a prime objective of the computation, and we use the paths matrix because it is clearly more compact than, but carries the same information as, the full list of paths that is illustrated in Figure 21.3.
Figure 21.8. All shortest paths
In terms of these data structures, path relaxation amounts to the following code:
if (d[s][t] > d[s][x] + d[x][t]) { d[s][t] = d[s][x] + d[x][t]; p[s][t] = p[s][x]; }
Like edge relaxation, this code reads as a restatement of the informal description that we have given, so we use it directly in our implementations. More formally, path relaxation reflects the following:
If a vertex x is on a shortest path from s to t, then that path consists of a shortest path from s to x followed by a shortest path from x to t.
Proof: By contradiction. We could use any shorter path from s to x or from x to t to build a shorter path from s to t. ▪
We encountered the path-relaxation operation when we discussed transitive-closure algorithms, in Section 19.3. If the edge and path weights are either 1 or infinite (that is, a path's weight is 1 only if all that path's edges have weight 1), then path relaxation is the operation that we used in Warshall's algorithm (if we have a path from s to x and a path from x to t, then we have a path from s to t). If we define a path's weight to be the number of edges on that path, then Warshall's algorithm generalizes to Floyd's algorithm for finding all shortest paths in unweighted digraphs; it further generalizes to apply to networks, as we see in Section 21.3.
From a mathematician's perspective, it is important to note that these algorithms all can be cast in a general algebraic setting that unifies and helps us to understand them. From a programmer's perspective, it is important to note that we can implement each of these algorithms using an abstract + operator (to compute path weights from edge weights) and an abstract < operator (to compute the minimum value in a set of path weights), both solely in the context of the relaxation operation (see Exercises 19.55 and 19.56).
Property 21.1 implies that a shortest path from s to t contains shortest paths from s to every other vertex along the path to t. Most shortest-paths algorithms also compute shortest paths from s to every vertex that is closer to s than to t (whether or not the vertex is on the path from s to t), although that is not a requirement (see Exercise 21.18). Solving the sourcesink shortest-paths problem with such an algorithm when t is the vertex that is farthest from s is equivalent to solving the single-source shortest-paths problem for s. Conversely, we could use a solution to the single-source shortest-paths problem from s as a method for finding the vertex that is farthest from s.
The paths matrix that we use in our implementations for the all-pairs problem is also a representation of the shortest-paths trees for each of the vertices. We defined p[s][t] to be the vertex that follows s on a shortest path from s to t. It is thus the same as the vertex that precedes s on the shortest path from t to s in the reverse network. In other words, column t in the paths matrix of a network is a vertex-indexed array that represents the SPT for vertex t in its reverse. Conversely, we can build the paths matrix for a network by filling each column with the vertex-indexed array representation of the SPT for the appropriate vertex in the reverse. This correspondence is illustrated in Figure 21.9.
Figure 21.9. All shortest paths in a network
In summary, relaxation gives us the basic abstract operations that we need to build our shortest paths algorithms. The primary complication is the choice of whether to provide the first or final edge on the shortest path. For example, single-source algorithms are more naturally expressed by providing the final edge on the path so that we need only a single vertex-indexed array to reconstruct the path, since all paths lead back to the source. This choice does not present a fundamental difficulty because we can either use the reverse graph as warranted or provide methods that hide this difference from clients. For example, we could specify a method in the interface that returns the edges on the shortest path in an array (see Exercises 21.15 and 21.16).
Accordingly, for simplicity, all of our implementations in this chapter include a method dist that returns a shortest-path length and either a method path that returns the first edge on a shortest path or a method pathR that returns the final edge on a shortest path. For example, our single-source implementations that use edge relaxation typically implement these methods as follows:
Edge pathR(int w) { return spt[w]; } double dist(int v) { return wt[v]; }
Similarly, our all-paths implementations that use path relaxation typically implement these methods as follows:
Edge path(int s, int t) { return p[s][t]; } double dist(int s, int t) { return d[s][t]; }
In some situations, it might be worthwhile to build interfaces that standardize on one or the other or both of these options; we choose the most natural one for the algorithm at hand.
Exercises
-
21.12 Draw the SPT from 0 for the network defined in Exercise 21.1 and for its reverse. Give the parent-link representation of both trees.
-
21.13 Consider the edges in the network defined in Exercise 21.1 to be undirected edges such that each edge corresponds to equal-weight edges in both directions in the network. Answer Exercise 21.12 for this corresponding network.
-
21.14 Change the direction of edge 0-2 in Figure 21.2. Draw two different SPTs that are rooted at 2 for this modified network.
-
21.15 Write a method that uses the pathR method from a single-source implementation to put references to the edges on the path from the source v to a given vertex w in a Java Vector.
-
21.16 Write a method that uses the path method from an all-paths implementation to put references to the edges on the path from a given vertex v to another given vertex w in a Java Vector.
-
21.17 Write a program that uses your method from Exercise 21.16 to print out all of the paths, in the style of Figure 21.3.
-
21.18 Give an example that shows how we could know that a path from s to t is shortest without knowing the length of a shorter path from s to x for some x.