- Old Reliable: The Selection Sort
- The Merge-Sort Algorithm
- Why Is One Algorithm Faster Than Others?
- Other Algorithms
The Merge-Sort Algorithm
One of the faster sorting techniques is the merge-sort algorithm, but it has one major drawback: It requires the allocation of a temporary array as large as the array to be sorted. If memory is scarce, the merge-sort algorithm might not be feasible.
Consider what it takes to merge two already sorted groups into a single sorted group. For example:
1 4 9 3 5 7
Once a merge is performed, the resulting array contains the following:
1 3 4 5 7 9
It's not too hard to come up with an algorithm that performs the merge. We keep track of an index (I or J) for each sub-range and then step through, comparing the current element in one group to the current element in the other. Only after an element is copied do we advance the index to the next element.
In this example, we begin by comparing 1 in the first list to 3 in the second list. Because 1 is the smaller number, we copy 1 to the temporary array and advance the first index to point to 4. Then we compare 4 in the first list to 3 in the second list. Now 3 is the smaller number, so we copy that number to the temporary array and advance the second index.
Here's the pseudo-code:
Initialize index I to the beginning of first sub-range. Initialize index J to the beginning of second sub-range. While neither sub-range's end point is reached, If A[I] < A[J] Copy A[I] to temporary array and increment I. Else Copy A[J] to next temporary array and increment J. While index I has not reached its end point, Copy A[I] to next temporary array and increment I. While index J has not reached its end point, Copy A[J] to next temporary array and increment J. Copy contents of the temporary array back to the first array.
Eventually, one of the two indexes is incremented to its end point. After that happens, the remainder of the other sub-range is copied. The result creates a temporary array that has all the elements in one continuous order. The final step is to copy these elements back to the first array.
Roughly N comparisons are performed, where N is the size of the groups put together. The algorithm also has to copy elements to and from the temporary array, but the total amount of work is still proportional to N, the number of elements involved.
Therefore a merge operation has a cost in time of O(n).
Now, how can we advance from this limited ability (merging two presorted sub-ranges) to the much more powerful ability of sorting any array whatsoever?
It's easy to see that if a sub-range of an array is of size 1 (that is, the range contains only one member), it's already sorted. That's a trivial observation, but it's going to be important.
What if a range is of size 2? For example:
7 3
We can break this range into two sub-ranges, each of size 1. Remember, sub-ranges of size 1 are already sorted! Merging these two "ranges" produces one sorted range, two elements in length:
3 7
Assume this has been done for two groups of 2. We now have two pre-sorted sub-ranges, and we can merge them to produce a single range, four elements in length. For example:
3 7 2 10
Merging the two sub-ranges (3, 7 and 2, 10) produces the following:
2 3 7 10
A range of four elements has been fully sorted. Next, assume that we've done this for two ranges of size 4.
2 3 7 10 1 8 9 11
We now do one last merge, to produce:
1 2 3 7 8 9 10 11
Voilà! A range eight members in length has been sorted.
We can repeat this process to sort ever-larger ranges, merging ranges of size 16, and then 32, 64, 128, and so on. The size of these ranges increases exponentially.
Look at the situation another way: As the size of the array (N) increases, how fast does the number of levels increase? The answer is an inverse-exponentiation function; that is, the logarithm of N, or log N. This figure increases relatively slowly. For example, it takes only one more level to sort up to 128 elements than to sort 64.
The total amount of work to be done at each level is always O(n). Therefore, the total time taken by this algorithm is as follows:
O(n log n)
For large N, this is less expensive than O(n2). Later in this article, I'll demonstrate how much difference this makes in practice. But first, let's code a merge sort.
Programming a Merge Sort
The previous section analyzed the merge-sort algorithm through a bottom-up approach. To code the algorithm, we need to take a top-down approach and use recursion. Recursion causes a function to call itself over and over until it reaches the terminal case, at which point the function starts returning, and things move from the bottom up. Here's the pseudo-code:
MergeSort(A[], iBegin, iEnd): If the size of the range is less than 2 Return immediately (we're already sorted!) Find the midpoint index, M, by averaging indexes iBegin and iEnd. MergeSort(A, iBegin, M). (Recursive call!) MergeSort(A, M, iEnd). (Recursive call!) Merge the sub-ranges (iBegin, M) and (M, iEnd).
These ranges are exclusive of their end points. For example, the range iBegin to M includes positions up to but not including M. The range M to iEnd includes positions up to but not including iEnd.
The C++ source code for a merge sort function is rather simple, although the code for the merge itself is a little more involved:
// Merge Ranges: Merge presorted sub-range iBegin to iMid // with presorted sub-range iMid to iEnd. // Ranges are exclusive of end points. void merge_ranges(int A[], int TempArray[], int iBegin, int iMid, int iEnd) { int i = iBegin; // i will index first sub-range. int j = iMid; // j will index second sub-range. int k = i; // index into temporary array // While neither end point has been reached, // Use A[i] < A[j] comparison to step through, always // copying the lesser of the two indexed elements. while (i < iMid && j < iEnd) { if (A[i] < A[j]) TempArray[k++] = A[i++]; else TempArray[k++] = A[j++]; } // Now, eat the "tail" remaining... while (i < iMid) TempArray[k++] = A[i++]; while (j < iEnd) TempArray[k++] = A[j++]; // Finally, copy back contents of temp. array. for (int i = iBegin; i < iEnd; i++) A[i] = TempArray[i]; } void merge_sort(int A[], int TempArray[], int iBegin, int iEnd) { if (iEnd—iBegin < 2) // If range is size 1, it is sorted! return; int iMid = (iBegin + iEnd) / 2; // Calculate midpoint. // Recursively sort each half of range. merge_sort(A, TempArray, iBegin, iMid); merge_sort(A, TempArray, iMid, iEnd); // Now, merge these two sorted ranges. merge_ranges(A, TempArray, iBegin, iMid, iEnd); }