Insertion Sort
In most cases the insertion sort is the best of the elementary sorts described in this chapter. It still executes in O(N²) time, but it's about twice as fast as the bubble sort and somewhat faster than the selection sort in normal situations. It's also not too complex, although it's slightly more involved than the bubble and selection sorts. It's often used as the final stage of more sophisticated sorts, such as quicksort.
Insertion Sort on the Baseball Players
To begin the insertion sort, start with your baseball players lined up in random order. (They wanted to play a game, but clearly there's no time for that.) It's easier to think about the insertion sort if we begin in the middle of the process, when the team is half sorted.
Partial Sorting
At this point there's an imaginary marker somewhere in the middle of the line. (Maybe you threw a red T-shirt on the ground in front of a player.) The players to the left of this marker are partially sorted. This means that they are sorted among themselves; each one is taller than the person to his or her left. However, the players aren't necessarily in their final positions because they may still need to be moved when previously unsorted players are inserted between them.
Note that partial sorting did not take place in the bubble sort and selection sort. In these algorithms a group of data items was completely sorted at any given time; in the insertion sort a group of items is only partially sorted.
The Marked Player
The player where the marker is, whom we'll call the "marked" player, and all the players on her right, are as yet unsorted. This is shown in Figure 3.11.a.
FIGURE 3.11 The insertion sort on baseball players.
What we're going to do is insert the marked player in the appropriate place in the (partially) sorted group. However, to do this, we'll need to shift some of the sorted players to the right to make room. To provide a space for this shift, we take the marked player out of line. (In the program this data item is stored in a temporary variable.) This step is shown in Figure 3.11.b.
Now we shift the sorted players to make room. The tallest sorted player moves into the marked player's spot, the next-tallest player into the tallest player's spot, and so on.
When does this shifting process stop? Imagine that you and the marked player are walking down the line to the left. At each position you shift another player to the right, but you also compare the marked player with the player about to be shifted. The shifting process stops when you've shifted the last player that's taller than the marked player. The last shift opens up the space where the marked player, when inserted, will be in sorted order. This step is shown in Figure 3.11.c.
Now the partially sorted group is one player bigger, and the unsorted group is one player smaller. The marker T-shirt is moved one space to the right, so it's again in front of the leftmost unsorted player. This process is repeated until all the unsorted players have been inserted (hence the name insertion sort) into the appropriate place in the partially sorted group.
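A single pass of the procedure just described can be sketched in Java as follows. This is only an illustration: the class name OnePassSketch, the method insertMarked, and the sample heights are assumptions made for this sketch and do not come from the chapter's program.

```java
import java.util.Arrays;

class OnePassSketch {
    // Hypothetical helper: inserts the player at index `marked` into the
    // already sorted group heights[0..marked-1] and returns the index
    // where the marked player ends up.
    static int insertMarked(int[] heights, int marked) {
        int temp = heights[marked];          // take the marked player out of line
        int pos = marked;                    // the empty spot starts at the marker
        while (pos > 0 && heights[pos - 1] > temp) {
            heights[pos] = heights[pos - 1]; // a taller player shifts one spot right
            pos--;                           // walk one position to the left
        }
        heights[pos] = temp;                 // the marked player steps into the gap
        return pos;
    }

    public static void main(String[] args) {
        // Assumed sample data: the first four players are sorted,
        // and the marker is in front of the player at index 4.
        int[] heights = {160, 170, 180, 190, 175, 165, 185};
        int landed = insertMarked(heights, 4);
        System.out.println("inserted at index " + landed);
        System.out.println(Arrays.toString(heights));
    }
}
```

With the sample heights, the players of height 190 and 180 shift right, and the marked player (175) lands at index 2, so the program prints "inserted at index 2" followed by [160, 170, 175, 180, 190, 165, 185]. The two players to the right of the marker are untouched.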
The InsertSort Workshop Applet
Use the InsertSort Workshop applet to demonstrate the insertion sort. With this applet, unlike the other sorting applets, it's probably more instructive to begin with 100 random bars rather than 10.
Sorting 100 Bars
Change to 100 bars with the Size button, and click Run to watch the bars sort themselves before your very eyes. You'll see that the short red outer arrow marks the dividing line between the partially sorted bars to the left and the unsorted bars to the right. The blue inner arrow keeps starting from outer and zipping to the left, looking for the proper place to insert the marked bar. Figure 3.12 shows how this process looks when about half the bars are partially sorted.
The marked bar is stored in the temporary variable pointed to by the magenta arrow at the right end of the graph, but the contents of this variable are replaced so often that it's hard to see what's there (unless you slow down to single-step mode).
Sorting 10 Bars
To get down to the details, use Size to switch to 10 bars. (If necessary, use New to make sure they're in random order.)
FIGURE 3.12 The InsertSort Workshop applet with 100 bars.
At the beginning, inner and outer point to the second bar from the left (array index 1), and the first message is Will copy outer to temp. This will make room for the shift. (There's no arrow for inner-1, but of course it's always one bar to the left of inner.)
Click the Step button. The bar at outer will be copied to temp. We say that items are copied from a source to a destination. When performing a copy, the applet removes the bar from the source location, leaving a blank. This is slightly misleading because in a real Java program the reference in the source would remain there. However, blanking the source makes it easier to see what's happening.
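The difference between the applet's display and a real program can be seen in a tiny sketch. The class name CopyDemo, the helper copy, and the sample values are made up for illustration; the sketch uses primitive long values, so it is the value (rather than a reference) that stays behind in the source slot.

```java
import java.util.Arrays;

class CopyDemo {
    // Hypothetical helper: copies a[src] into a[dest] and returns the array.
    static long[] copy(long[] a, int src, int dest) {
        a[dest] = a[src];   // the source slot is NOT blanked by the copy
        return a;
    }

    public static void main(String[] args) {
        long[] a = {30, 10, 20};   // assumed sample values
        long temp = a[1];          // like "copy outer to temp": a[1] still holds 10
        System.out.println(Arrays.toString(a) + " temp=" + temp);
        copy(a, 0, 1);             // like "copy inner-1 to inner"
        System.out.println(Arrays.toString(a));   // a[0] and a[1] both hold 30 now
    }
}
```

After the second copy the array holds 30 in both of its first two slots until something else is written over the source, which is what the applet hides by drawing a blank.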
What happens next depends on whether the first two bars are already in order (smaller on the left). If they are, you'll see the message Have compared inner-1 and temp, no copy necessary.
If the first two bars are not in order, the message is Have compared inner-1 and temp, will copy inner-1 to inner. This is the shift that's necessary to make room for the value in temp to be reinserted. There's only one such shift on this first pass; more shifts will be necessary on subsequent passes. The situation is shown in Figure 3.13.
On the next click, you'll see the copy take place from inner-1 to inner. Also, the inner arrow moves one space left. The new message is Now inner is 0, so no copy necessary. The shifting process is complete.
No matter which of the first two bars was shorter, the next click will show you Will copy temp to inner. This will happen, but if the first two bars were initially in order, you won't be able to tell a copy was performed because temp and inner hold the same bar. Copying data over the top of the same data may seem inefficient, but the algorithm runs faster if it doesn't check for this possibility, which happens comparatively infrequently.
FIGURE 3.13 The InsertSort Workshop applet with 10 bars.
Now the first two bars are partially sorted (sorted with respect to each other), and the outer arrow moves one space right, to the third bar (index 2). The process repeats, with the Will copy outer to temp message. On this pass through the sorted data, there may be no shifts, one shift, or two shifts, depending on where the third bar fits among the first two.
Continue to single-step the sorting process. Again, you can see what's happening more easily after the process has run long enough to provide some sorted bars on the left. Then you can see how just enough shifts take place to make room for the reinsertion of the bar from temp into its proper place.
Java Code for Insertion Sort
Here's the method that carries out the insertion sort, extracted from the insertSort.java program:
public void insertionSort() {
    int in, out;

    for (out = 1; out < nElems; out++) {        // out is dividing line
        long temp = a[out];                     // remove marked item
        in = out;                               // start shifts at out
        while (in > 0 && a[in-1] >= temp) {     // until one is smaller,
            a[in] = a[in-1];                    // shift item right,
            --in;                               // go left one position
        }
        a[in] = temp;                           // insert marked item
    }  // end for
}  // end insertionSort()
In the outer for loop, out starts at 1 and moves right. It marks the leftmost unsorted data. In the inner while loop, in starts at out and moves left, until either the element to its left, a[in-1], is smaller than temp, or it can't go left any farther (in has reached 0). Each pass through the while loop shifts another sorted element one space right.
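To watch out and in at work, the same loop structure can be wrapped in a small tracing routine. The sketch below is not part of insertSort.java: the class name InsertionTrace and the method sortWithTrace are hypothetical, and the routine works on a long[] passed as a parameter rather than on the class's own array.

```java
import java.util.Arrays;

class InsertionTrace {
    // Sorts a in place with the same loops as insertionSort() above and
    // returns one text snapshot of the array per pass of the outer loop.
    static String[] sortWithTrace(long[] a) {
        String[] snapshots = new String[Math.max(a.length - 1, 0)];
        for (int out = 1; out < a.length; out++) {
            long temp = a[out];                    // remove marked item
            int in = out;                          // start shifts at out
            while (in > 0 && a[in - 1] >= temp) {  // until one is smaller,
                a[in] = a[in - 1];                 // shift item right
                --in;                              // go left one position
            }
            a[in] = temp;                          // insert marked item
            snapshots[out - 1] =
                "out=" + out + " inserted at " + in + ": " + Arrays.toString(a);
        }
        return snapshots;
    }

    public static void main(String[] args) {
        long[] data = {77, 99, 44, 55, 22};        // first five items of the listing's data
        for (String line : sortWithTrace(data))
            System.out.println(line);
    }
}
```

For this input the second pass reports out=2 inserted at 0: [44, 77, 99, 55, 22], which shows 77 and 99 each shifted one space right to make room for 44.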
It may be hard to see the relation between the steps in the InsertSort Workshop applet and the code, so Figure 3.14 is an activity diagram of the insertionSort() method, with the corresponding messages from the InsertSort Workshop applet. Listing 3.3 shows the complete insertSort.java program.
FIGURE 3.14 Activity diagram for insertionSort().
LISTING 3.3 The insertSort.java Program
// insertSort.java
// demonstrates insertion sort
// to run this program: C>java InsertSortApp
//--------------------------------------------------------------
class ArrayIns {
    private long[] a;                 // ref to array a
    private int nElems;               // number of data items
    //--------------------------------------------------------------
    public ArrayIns(int max) {        // constructor
        a = new long[max];            // create the array
        nElems = 0;                   // no items yet
    }
    //--------------------------------------------------------------
    public void insert(long value) {  // put element into array
        a[nElems] = value;            // insert it
        nElems++;                     // increment size
    }
    //--------------------------------------------------------------
    public void display() {           // displays array contents
        for (int j = 0; j < nElems; j++)       // for each element,
            System.out.print(a[j] + " ");      // display it
        System.out.println("");
    }
    //--------------------------------------------------------------
    public void insertionSort() {
        int in, out;

        for (out = 1; out < nElems; out++) {       // out is dividing line
            long temp = a[out];                    // remove marked item
            in = out;                              // start shifts at out
            while (in > 0 && a[in-1] >= temp) {    // until one is smaller,
                a[in] = a[in-1];                   // shift item to right
                --in;                              // go left one position
            }
            a[in] = temp;                          // insert marked item
        }  // end for
    }  // end insertionSort()
    //--------------------------------------------------------------
}  // end class ArrayIns
////////////////////////////////////////////////////////////////
class InsertSortApp {
    public static void main(String[] args) {
        int maxSize = 100;            // array size
        ArrayIns arr;                 // reference to array
        arr = new ArrayIns(maxSize);  // create the array

        arr.insert(77);               // insert 10 items
        arr.insert(99);
        arr.insert(44);
        arr.insert(55);
        arr.insert(22);
        arr.insert(88);
        arr.insert(11);
        arr.insert(00);
        arr.insert(66);
        arr.insert(33);

        arr.display();                // display items

        arr.insertionSort();          // insertion-sort them

        arr.display();                // display them again
    }  // end main()
}  // end class InsertSortApp
////////////////////////////////////////////////////////////////
Here's the output from the insertSort.java program; it's the same as that from the other programs in this chapter:
77 99 44 55 22 88 11 0 66 33
0 11 22 33 44 55 66 77 88 99
Invariants in the Insertion Sort
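At the end of each pass, following the insertion of the item from temp, the data items with indices up to and including out (outer in the applet) are partially sorted. Equivalently, once out is incremented for the next pass, all the data items with smaller indices than out are partially sorted.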
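The invariant can be checked mechanically while the sort runs. The following is a sketch under assumptions, not code from the chapter: the class name InvariantCheck and the methods prefixSorted and checkedInsertionSort are hypothetical names, and the check itself adds extra work to every pass, so it's useful only for testing.

```java
class InvariantCheck {
    // Returns true if a[0..end-1] is in nondecreasing order.
    static boolean prefixSorted(long[] a, int end) {
        for (int j = 1; j < end; j++)
            if (a[j - 1] > a[j]) return false;
        return true;
    }

    // Insertion sort that verifies the invariant at the end of every pass
    // and throws IllegalStateException if it is ever violated.
    static void checkedInsertionSort(long[] a) {
        for (int out = 1; out < a.length; out++) {
            long temp = a[out];                    // remove marked item
            int in = out;                          // start shifts at out
            while (in > 0 && a[in - 1] >= temp) {  // until one is smaller,
                a[in] = a[in - 1];                 // shift item right
                --in;                              // go left one position
            }
            a[in] = temp;                          // insert marked item
            // Invariant: items 0..out are now sorted among themselves.
            if (!prefixSorted(a, out + 1))
                throw new IllegalStateException("invariant broken at out=" + out);
        }
    }

    public static void main(String[] args) {
        long[] data = {77, 99, 44, 55, 22, 88, 11, 0, 66, 33};
        checkedInsertionSort(data);                // completes without throwing
        System.out.println(java.util.Arrays.toString(data));
    }
}
```

Note that the check says nothing about the items to the right of out; they're still unsorted, and any of them may later be inserted among the items on the left.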
Efficiency of the Insertion Sort
How many comparisons and copies does this algorithm require? On the first pass, it compares a maximum of one item. On the second pass, it's a maximum of two items, and so on, up to a maximum of N-1 comparisons on the last pass. This is
1 + 2 + 3 + ... + (N-1) = N*(N-1)/2
However, because on each pass an average of only half of the maximum number of items are actually compared before the insertion point is found, we can divide by 2, which gives
N*(N-1)/4
The number of copies is approximately the same as the number of comparisons. However, a copy isn't as time-consuming as a swap, so for random data this algorithm runs twice as fast as the bubble sort and faster than the selection sort.
In any case, like the other sort routines in this chapter, the insertion sort runs in O(N²) time for random data.
For data that is already sorted or almost sorted, the insertion sort does much better. When data is in order, the condition in the while loop is never true, so it becomes a simple statement in the outer loop, which executes N-1 times. In this case the algorithm runs in O(N) time. If the data is almost sorted, insertion sort runs in almost O(N) time, which makes it a simple and efficient way to order a file that is only slightly out of order.
However, for data arranged in inverse sorted order, every possible comparison and shift is carried out, so the insertion sort runs no faster than the bubble sort. You can check this using the reverse-sorted data option (toggled with New) in the InsertSort Workshop applet.
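The three cases discussed in this section (random, already sorted, and inverse sorted data) can be compared by counting operations directly. The sketch below is an assumption-laden illustration rather than part of the chapter's code: the class name CaseCounts, the method count, the array size of 1000, and the random seed are all arbitrary choices. It counts each test of a[in-1] against temp as one comparison and each move of an element one space right as one shift; the copies to and from temp are not counted.

```java
import java.util.Random;

class CaseCounts {
    // Sorts a in place and returns {comparisons, shifts}.
    static long[] count(long[] a) {
        long comparisons = 0, shifts = 0;
        for (int out = 1; out < a.length; out++) {
            long temp = a[out];                // remove marked item
            int in = out;
            while (in > 0) {
                comparisons++;                 // one test of a[in-1] against temp
                if (a[in - 1] < temp) break;   // insertion point found
                a[in] = a[in - 1];             // shift item right
                shifts++;
                --in;
            }
            a[in] = temp;                      // insert marked item
        }
        return new long[] {comparisons, shifts};
    }

    static void report(String label, long[] c) {
        System.out.println(label + ": comparisons=" + c[0] + ", shifts=" + c[1]);
    }

    public static void main(String[] args) {
        int n = 1000;                          // arbitrary size for the demonstration
        long[] sorted = new long[n], reversed = new long[n], random = new long[n];
        Random rng = new Random(42);           // arbitrary fixed seed
        for (int j = 0; j < n; j++) {
            sorted[j] = j;
            reversed[j] = n - j;
            random[j] = rng.nextInt(1_000_000);
        }
        report("sorted  ", count(sorted));
        report("random  ", count(random));
        report("reversed", count(reversed));
        System.out.println("N-1=" + (n - 1)
            + ", N*(N-1)/4=" + (long) n * (n - 1) / 4
            + ", N*(N-1)/2=" + (long) n * (n - 1) / 2);
    }
}
```

For sorted input the counts are exactly N-1 comparisons and no shifts, and for inverse sorted input with distinct values they are exactly N*(N-1)/2 of each. For random input the counts vary with the data but should fall near N*(N-1)/4.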