AP CS Sorting
 

Sorts


There are many common algorithms used for sorting through collections. Here are some important ones to know:

Selection Sort

Selection sort is an iterative sort algorithm that uses a "search and swap" approach to sort a collection. For each pass through the collection, the algorithm finds the smallest element to be sorted and swaps it with the first unsorted element in the collection. The algorithm continues in this manner, finding the next smallest element in the collection and swapping it with the next unsorted element. Finally, when just two unsorted elements remain, they are compared (and if necessary, swapped) and the sort is complete.

Efficiency

The efficiency of the selection sort algorithm and the number of comparisons made do not depend on the initial order of the elements in the collection; the same number of comparisons is made, regardless of initial order. For a collection of n elements, the collection is sorted after n-1 passes. After the kth pass, the first k elements have been sorted and are placed in their final sorted positions. In the kth pass, n-k comparisons are made. The total number of comparisons made after the sort is complete is n(n-1)/2. The selection sort algorithm has an efficiency of O(n2) in the best, average, and worst cases. In other words, there is no best or worst case for selection sort; every case is considered average.

*can sort only an array of integers....


Insertion Sort

Insertion sort is also an iterative sort algorithm. When using this algorithm, the first element is "sorted" with respect to itself. The algorithm then takes the second element and compares it with the first, inserting it in front of the first one if necessary. From this point forward, the algorithm takes the next unsorted element and compares it with the sorted elements, inserting the element in the correct position as needed. The algorithm continues in this manner, shifting sorted elements as needed to make room for the next element to be inserted.

Efficiency

The efficiency of the insertion sort algorithm and the number of comparisons made do depend on the initial order of the elements in the collection. For an array of n elements, the array is sorted after n-1 passes. The efficiency differs in the best, average, and worst cases.

  • The best case for insertion cases occurs if the collection is already initially sorted in order. Each pass through the array will require only one comparison, which will indicate that the element being compared is already in its correct position with respect to the sorted list. No elements will need to be moved. The total number of comparisons made is (n-1). In this case, the insertion sort algorithm has an efficiency of O(n).
  • The worst case for insertion sort occurs if the collection is initially sorted in reverse order, which will result in the maximum possible number of comparisons and moves being needed to sort the collection. After the kth pass, the first k+1 elements are sorted with respect to each other but are not placed in their final sorted positions. In the kth pass, k comparisons are made. The total number of comparisons made after the sort is complete is n(n-1)/2. In this case, the insertion sort algorithm has an efficiency of O(n2).

*can sort both an array of integers as well as an array of objects....


Mergesort

Mergesort is a recursive algorithm that uses a "divide and conquer" approach to sorting collections. Each time the algorithm is called, the algorithm checks to see if there is more than one element in the collection, and if there is, the collection is "broken" into two halves. If there is even number of elements, the halves are equal, but if there is an odd number of elements, the left half will contain one more element than the right half. At this point, the algorithm uses recursion, calling itself to first mergesort the left half of the collection, then mergesort the right half. When the algorithm calls itself to mergesort one of the halves, it again "breaks" the half into two halves, then calls itself to sort each half. This process is repeated until the entire collection is "broken" into individual elements, or subcollections of length 1. When the method calls itself to sort one of these, the initial test that sees if there is more than one element in the subcollection fails, since there is only one element in the subcollection. This is where the recursion stops, and the algorithm returns to sorting the half by comparing the two adjacent elements, sorting them, and then recombining them into a sorted subcollection. This process continues until all of the individual elements have been sorted and recombined into sorted subcollections. Then the subcollections themselves are compared, sorted, and recombined. This process continues until the subcollections have been recombined to form two sorted halves. Finally, the left and right halves are compared to each other, sorted, and recombined to create the final, fully sorted collection.

Pseudocode

Here is a pseudocode representation of the mergesort algorithm:

If there is more than one element in the collection
     Break the collection into two halves
     Mergesort the left half
     Mergesort the right half
     Compare the two halves
     Merge the two subcollections into a sorted collection

Efficiency

The efficiency of mergesort is not affected by the initial ordering of the elements. Because mergesort requires the use of temporary collections that are as large as the collections to be sorted, it should not be used if only limited memory space is available. This algorithm consists of multiple parts:

  • The part of the algorithm that merges the subcollections first compares each element in the subcollections, which is an O(n) process, then copies the elements from the temporary collection back into the original collection, which is also an O(n) process. This makes the actual merging of the algorithm requires a total of 2n operations, making this part of the algorithm an O(n) process. (Remember that constants are ignored in Big-O notation.)
  • The "breaking" of the collection of n elements into n subcollections of one element each requires log2n divisions, which is an O(logn) process.

Remember that for each division of the collection into subcollections, the merging part of the algorithm must be called. This makes the mergesort algorithm have an efficiency of O(nlogn) in the best, average, and worst cases. In other words, there is no best or worst case because every case is considered average.

*can sort both an array of integers as well as an array of objects....