Bubble Sort
The bubble sort is notoriously slow, but it's conceptually the simplest of the sorting algorithms and for that reason is a good beginning for our exploration of sorting techniques.
Bubble Sort on the Baseball Players
Imagine that you're near-sighted (like a computer program) so that you can see only two of the baseball players at the same time, if they're next to each other and if you stand very close to them. Given this impediment, how would you sort them? Let's assume there are N players, and the positions they're standing in are numbered from 0 on the left to N-1 on the right.
The bubble sort routine works like this: You start at the left end of the line and compare the two kids in positions 0 and 1. If the one on the left (in 0) is taller, you swap them. If the one on the right is taller, you don't do anything. Then you move over one position and compare the kids in positions 1 and 2. Again, if the one on the left is taller, you swap them. This sorting process is shown in Figure 3.3.
Here are the rules you're following:
-
Compare two players.
-
If the one on the left is taller, swap them.
-
Move one position right.
-
When you reach the first sorted player, start over at the left end of the line.
FIGURE 3.3 Bubble sort: the beginning of the first pass.
You continue down the line this way until you reach the right end. You have by no means finished sorting the kids, but you do know that the tallest kid is on the right. This must be true because, as soon as you encounter the tallest kid, you'll end up swapping him (or her) every time you compare two kids, until eventually he (or she) will reach the right end of the line. This is why it's called the bubble sort: As the algorithm progresses, the biggest items "bubble up" to the top end of the array. Figure 3.4 shows the baseball players at the end of the first pass.
FIGURE 3.4 Bubble sort: the end of the first pass.
After this first pass through all the data, you've made N-1 comparisons and somewhere between 0 and N-1 swaps, depending on the initial arrangement of the players. The item at the end of the array is sorted and won't be moved again.
Now you go back and start another pass from the left end of the line. Again, you go toward the right, comparing and swapping when appropriate. However, this time you can stop one player short of the end of the line, at position N-2, because you know the last position, at N-1, already contains the tallest player. This rule could be stated as:
You continue this process until all the players are in order. Describing this process is much harder than demonstrating it, so let's watch the BubbleSort Workshop applet at work.
The BubbleSort Workshop Applet
Start the BubbleSort Workshop applet. You'll see something that looks like a bar graph, with the bar heights randomly arranged, as shown in Figure 3.5.
FIGURE 3.5 The BubbleSort Workshop applet.The Run Button
This Workshop applet contains a two-speed graph: You can either let it run by itself, or you can single-step through the process. To get a quick idea what happens, click the Run button. The algorithm will bubble-sort the bars. When it finishes, in 10 seconds or so, the bars will be sorted, as shown in Figure 3.6.
FIGURE 3.6 After the bubble sort.The New Button
To do another sort, press the New button. New creates a new set of bars and initializes the sorting routine. Repeated presses of New toggle between two arrangements of bars: a random order, as shown in Figure 3.5, and an inverse ordering where the bars are sorted backward. This inverse ordering provides an extra challenge for many sorting algorithms.
The Step Button
The real payoff for using the BubbleSort Workshop applet comes when you single-step through a sort. You can see exactly how the algorithm carries out each step.
Start by creating a new randomly arranged graph with New. You'll see three arrows pointing at different bars. Two arrows, labeled inner and inner+1, are side by side on the left. Another arrow, outer, starts on the far right. (The names are chosen to correspond to the inner and outer loop variables in the nested loops used in the algorithm.)
Click once on the Step button. You'll see the inner and the inner+1 arrows move together one position to the right, swapping the bars if appropriate. These arrows correspond to the two players you compared, and possibly swapped, in the baseball scenario.
A message under the arrows tells you whether the contents of inner and inner+1 will be swapped, but you know this just from comparing the bars: If the taller one is on the left, they'll be swapped. Messages at the top of the graph tell you how many swaps and comparisons have been carried out so far. (A complete sort of 10 bars requires 45 comparisons and, on the average, about 22 swaps.)
Continue pressing Step. Each time inner and inner+1 finish going all the way from 0 to outer, the outer pointer moves one position to the left. At all times during the sorting process, all the bars to the right of outer are sorted; those to the left of (and at) outer are not.
The Size Button
The Size button toggles between 10 bars and 100 bars. Figure 3.7 shows what the 100 random bars look like.
You probably don't want to single-step through the sorting process for 100 bars, unless you're unusually patient. Press Run instead, and watch how the blue inner and inner+1 pointers seem to find the tallest unsorted bar and carry it down the row to the right, inserting it just to the left of the previously sorted bars.
Figure 3.8 shows the situation partway through the sorting process. The bars to the right of the red (longest) arrow are sorted. The bars to the left are beginning to look sorted, but much work remains to be done.
FIGURE 3.7 The BubbleSort applet with 100 bars.FIGURE 3.8 The 100 partly sorted bars.
If you started a sort with Run and the arrows are whizzing around, you can freeze the process at any point by pressing the Step button. You can then single-step to watch the details of the operation or press Run again to return to high-speed mode.
The Draw Button
Sometimes while running the sorting algorithm at full speed, the computer takes time off to perform some other task. This can result in some bars not being drawn. If this happens, you can press the Draw button to redraw all the bars. Doing so pauses the run, so you'll need to press the Run button again to continue.
You can press Draw at any time there seems to be a glitch in the display.
Java Code for a Bubble Sort
In the bubbleSort.java program, shown in Listing 3.1, a class called ArrayBub encapsulates an array a[], which holds variables of type long.
In a more serious program, the data would probably consist of objects, but we use a primitive type for simplicity. (We'll see how objects are sorted in the objectSort.java program in Listing 3.4.) Also, to reduce the size of the listing, we don't show find() and delete() methods with the ArrayBub class, although they would normally be part of a such a class.
LISTING 3.1 The bubbleSort.java Program
// bubbleSort.java // demonstrates bubble sort // to run this program: C>java BubbleSortApp //////////////////////////////////////////////////////////////// class ArrayBub { private long[] a; // ref to array a private int nElems; // number of data items //-------------------------------------------------------------- public ArrayBub(int max) // constructor { a = new long[max]; // create the array nElems = 0; // no items yet } //-------------------------------------------------------------- public void insert(long value) // put element into array { a[nElems] = value; // insert it nElems++; // increment size } //-------------------------------------------------------------- public void display() // displays array contents { for(int j=0; j<nElems; j++) // for each element, System.out.print(a[j] + " "); // display it System.out.println(""); } //-------------------------------------------------------------- public void bubbleSort() { int out, in; for(out=nElems-1; out>1; out--) // outer loop (backward) for(in=0; in<out; in++) // inner loop (forward) if( a[in] > a[in+1] ) // out of order? swap(in, in+1); // swap them } // end bubbleSort() //-------------------------------------------------------------- private void swap(int one, int two) { long temp = a[one]; a[one] = a[two]; a[two] = temp; } //-------------------------------------------------------------- } // end class ArrayBub //////////////////////////////////////////////////////////////// class BubbleSortApp { public static void main(String[] args) { int maxSize = 100; // array size ArrayBub arr; // reference to array arr = new ArrayBub(maxSize); // create the array arr.insert(77); // insert 10 items arr.insert(99); arr.insert(44); arr.insert(55); arr.insert(22); arr.insert(88); arr.insert(11); arr.insert(00); arr.insert(66); arr.insert(33); arr.display(); // display items arr.bubbleSort(); // bubble sort them arr.display(); // display them again } // end main() } // end class BubbleSortApp ////////////////////////////////////////////////////////////////
The constructor and the insert() and display() methods of this class are similar to those we've seen before. However, there's a new method: bubbleSort(). When this method is invoked from main(), the contents of the array are rearranged into sorted order.
The main() routine inserts 10 items into the array in random order, displays the array, calls bubbleSort() to sort it, and then displays it again. Here's the output:
77 99 44 55 22 88 11 0 66 33 0 11 22 33 44 55 66 77 88 99
The bubbleSort() method is only four lines long. Here it is, extracted from the listing:
public void bubbleSort() { int out, in; for(out=nElems-1; out>1; out--) // outer loop (backward) for(in=0; in<out; in++) // inner loop (forward) if( a[in] > a[in+1] ) // out of order? swap(in, in+1); // swap them } // end bubbleSort()
The idea is to put the smallest item at the beginning of the array (index 0) and the largest item at the end (index nElems-1). The loop counter out in the outer for loop starts at the end of the array, at nElems-1, and decrements itself each time through the loop. The items at indices greater than out are always completely sorted. The out variable moves left after each pass by in so that items that are already sorted are no longer involved in the algorithm.
The inner loop counter in starts at the beginning of the array and increments itself each cycle of the inner loop, exiting when it reaches out. Within the inner loop, the two array cells pointed to by in and in+1 are compared, and swapped if the one in in is larger than the one in in+1.
For clarity, we use a separate swap() method to carry out the swap. It simply exchanges the two values in the two array cells, using a temporary variable to hold the value of the first cell while the first cell takes on the value in the second and then setting the second cell to the temporary value. Actually, using a separate swap() method may not be a good idea in practice because the function call adds a small amount of overhead. If you're writing your own sorting routine, you may prefer to put the swap instructions in line to gain a slight increase in speed.
Invariants
In many algorithms there are conditions that remain unchanged as the algorithm proceeds. These conditions are called invariants. Recognizing invariants can be useful in understanding the algorithm. In certain situations they may also be helpful in debugging; you can repeatedly check that the invariant is true, and signal an error if it isn't.
In the bubbleSort.java program, the invariant is that the data items to the right of out are sorted. This remains true throughout the running of the algorithm. (On the first pass, nothing has been sorted yet, and there are no items to the right of out because it starts on the rightmost element.)
Efficiency of the Bubble Sort
As you can see by watching the BubbleSort Workshop applet with 10 bars, the inner and inner+1 arrows make nine comparisons on the first pass, eight on the second, and so on, down to one comparison on the last pass. For 10 items, this is
9 + 8 + 7 + 6 + 5 + 4 + 3 + 2 + 1 = 45
In general, where N is the number of items in the array, there are N-1 comparisons on the first pass, N-2 on the second, and so on. The formula for the sum of such a series is
(N1) + (N2) + (N3) + ... + 1 = N*(N1)/2
N*(N1)/2 is 45 (10*9/2) when N is 10.
Thus, the algorithm makes about N2⁄2 comparisons (ignoring the 1, which doesn't make much difference, especially if N is large).
There are fewer swaps than there are comparisons because two bars are swapped only if they need to be. If the data is random, a swap is necessary about half the time, so there will be about N2⁄4 swaps. (Although in the worst case, with the initial data inversely sorted, a swap is necessary with every comparison.)
Both swaps and comparisons are proportional to N2. Because constants don't count in Big O notation, we can ignore the 2 and the 4 and say that the bubble sort runs in O(N2) time. This is slow, as you can verify by running the BubbleSort Workshop applet with 100 bars.
Whenever you see one loop nested within another, such as those in the bubble sort and the other sorting algorithms in this chapter, you can suspect that an algorithm runs in O(N2) time. The outer loop executes N times, and the inner loop executes N (or perhaps N divided by some constant) times for each cycle of the outer loop. This means you're doing something approximately N*N or N2 times.