Other Algorithms
Sorting belongs to a class of problems with multiple solutions; that is, there are more than one or two ways to sort an array. Let's look at some other sorting techniques.
Bubble Sort
The bubble sort algorithm is only slightly more sophisticated than selection sort. This algorithm makes a pass through the array, comparing each element to its immediate neighbor. When two neighboring elements are out of order relative to each other, they're swapped. This action causes the largest element to "bubble up" to the last position during the first pass. Therefore, the next pass operates on only the first N-1 elements. The process is repeated on the first N-2 elements, the first N-3 elements, and so on. The process continues until no swaps are needed during a full pass. When that happens, the algorithm quits.
The pseudocode is fairly simple, and so is the bubble-sort C++ code if you care to write it:
For I = N-1 down to 1
    For J = 0 up to but not including I
        If A[J] > A[J+1]
            Swap(A[J], A[J+1])
    If no swaps were done during this pass, exit
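Here's one way that pseudocode might translate into C++. Treat it as a sketch rather than a definitive implementation; the function name and in-place array interface are assumptions, not part of the original listing.

```cpp
#include <utility>  // for std::swap

// A minimal bubble-sort sketch following the pseudocode above.
// Sorts the first n elements of array a into ascending order.
void bubble_sort(int a[], int n) {
    for (int i = n - 1; i >= 1; --i) {
        bool swapped = false;
        for (int j = 0; j < i; ++j) {
            if (a[j] > a[j + 1]) {
                std::swap(a[j], a[j + 1]);
                swapped = true;
            }
        }
        if (!swapped)   // no swaps on this pass: the array is already sorted
            return;
    }
}
```

The swapped flag is what produces the early exit described next: if a full pass makes no swaps, the array is already in order.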
In the typical case, the duration is O(n²). But in the best case, with an array that is presorted, the duration is O(n), because the bubble sort quits after just one pass. This makes the bubble sort potentially faster than other algorithms when the data is already sorted or nearly so. In general, though, a bubble sort takes O(n²), which makes it a poor algorithm for large N.
Quicksort
The quicksort algorithm is in some ways the most sophisticated sorting algorithm of all. It starts by selecting a "pivot" point within a range. It then places every element that's less than the pivot to its left, and every element that's greater than the pivot to its right. Each pass has a duration of O(n). When this is done recursively for smaller and smaller ranges, the entire array is sorted.
The algorithm can be summarized as follows, in which the ranges are defined to be exclusive of end points:
Quicksort(A[], iBegin, iEnd):
    If iEnd - iBegin < 2
        Return
    Select pivot P within the range iBegin to iEnd.
    Partition the range so that all values less than P are to its left
    and all values greater than P are to its right. As a result, P is
    placed at a new position, NP.
    Quicksort(A, iBegin, NP)
    Quicksort(A, NP + 1, iEnd)
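As a rough illustration, here's one way the pseudocode might be fleshed out in C++. The function name, the choice of the midpoint as the pivot, and the simple partition scheme are assumptions for the sake of the sketch; real implementations vary.

```cpp
#include <utility>  // for std::swap

// A quicksort sketch over the half-open range [iBegin, iEnd),
// matching the pseudocode above. The pivot is simply the midpoint.
void quicksort(int a[], int iBegin, int iEnd) {
    if (iEnd - iBegin < 2)
        return;                         // 0 or 1 elements: already sorted

    int iMid = iBegin + (iEnd - iBegin) / 2;
    std::swap(a[iMid], a[iEnd - 1]);    // move the pivot out of the way
    int pivot = a[iEnd - 1];

    // Partition: everything less than the pivot ends up before position np.
    int np = iBegin;
    for (int j = iBegin; j < iEnd - 1; ++j) {
        if (a[j] < pivot)
            std::swap(a[j], a[np++]);
    }
    std::swap(a[np], a[iEnd - 1]);      // place the pivot at its final position np

    quicksort(a, iBegin, np);           // left sub-range, excluding the pivot
    quicksort(a, np + 1, iEnd);         // right sub-range, excluding the pivot
}
```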
In the typical case, the time duration is O(n log n), matching the speed of a merge sort. However, in the worst case, the time duration is O(n²), which is much less desirable.
The degenerate case can happen because it's not always easy to choose a good pivot point. The most naïve quicksort implementations just select the first or last element in the range as the pivot. But if the array is already sorted, this approach results in a degenerate case in which every range is split into ranges of size 1 and size N-1, thereby making a quicksort as slow as a selection sort! To avoid that problem, quicksort algorithms often use the midpoint (at index iMid) or take the median value of iBegin, iMid, and iEnd. But even that strategy causes poor results if all values in a sub-range are equal.
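For illustration, a median-of-three pivot selection might look something like the following sketch. The helper name median_of_three is hypothetical, not part of any listing in the article.

```cpp
// Illustrative median-of-three pivot selection for the half-open range
// [iBegin, iEnd): return the index holding the median of the first,
// middle, and last values.
int median_of_three(const int a[], int iBegin, int iEnd) {
    int iLast = iEnd - 1;
    int iMid  = iBegin + (iEnd - iBegin) / 2;
    int x = a[iBegin], y = a[iMid], z = a[iLast];

    if ((x <= y && y <= z) || (z <= y && y <= x)) return iMid;    // middle value is the median
    if ((y <= x && x <= z) || (z <= x && x <= y)) return iBegin;  // first value is the median
    return iLast;                                                 // otherwise the last value is
}
```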
Despite this worst-case drawback, a quicksort has some advantages over a merge sort. In the typical case, quicksort is usually somewhat faster (typically in the range of 40–50%). But it has another, more substantial advantage. A merge sort requires significant extra space, of size O(n): The size of the temporary array required is equal to that of the array being sorted. This requirement is acceptable only when you have large amounts of memory available. In contrast, the quicksort algorithm needs relatively little extra space. Specifically, it needs only O(log n), which is required for the stack as a result of recursion.
Assuming that the degenerate case doesn't happen, it should be easy to see why the quicksort algorithm makes far fewer comparisons than the selection sort. Again, consider that a selection sort always does N(N-1)/2 comparisons, enough to compare each and every combination of two elements. (A bubble sort makes the same number of comparisons, unless it quits early.) But a quicksort divides a range into two smaller ranges. Once an element is grouped into a sub-range, that element is never compared to anything outside its range. This technique greatly reduces the number of comparisons.
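To put rough numbers on that: with N = 1,000, a selection sort makes 1,000 × 999 / 2 = 499,500 comparisons, while a quicksort with reasonable pivots makes on the order of N log N comparisons, roughly 1,000 × 10 = 10,000.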
Each level of the algorithm—during which every element is grouped into a smaller left or right partition—takes a total duration of O(n). As long as reasonable pivot points are selected, the sort has log n levels of recursion. This is why quicksort, like merge sort, has a duration of O(n log n).
Lessons Learned
What information can we take away from comparing all these algorithms? Basically, we've observed two rules.
- Opt for the shortest duration. When designing algorithms to be used repeatedly (or in very large projects), consider how the complexity and time duration increase with the size of the data. The smallest durations, in order, are as follows:
- constant, O(1)
- logarithmic, O(log n)
- linear, O(n)
But these durations may not always be achievable. If at all possible, avoid quadratic or worse time durations such as O(n²). If you can achieve times of O(n log n) instead of O(n²), that's a major victory for speed and efficiency.
- Accept that tradeoffs are part of the deal. The other point we've seen in this article is that old tradeoff: speed versus compactness (or, put another way, time versus space). The merge-sort algorithm guarantees a duration that's never more than O(n log n), which is both its best and worst case. But the tradeoff for this speed is that this algorithm requires more space. You can't have everything.