Python Programming

Lecture 15 Sorting Algorithms, Greedy Algorithm

15.1 Sorting Algorithms

  • Sorting is the process of placing the elements of a collection in order. Many sorting algorithms have been developed and analyzed, which suggests that sorting is an important area of study in computer science. This lecture covers:
    • Bubble Sort
    • Selection Sort
    • Insertion Sort
    • Merge Sort
    • Quick Sort

1. Bubble sort

  • The bubble sort makes multiple passes through a list. It compares adjacent items and exchanges those that are out of order. Each pass through the list places the next largest value in its proper place. In essence, each item “bubbles” up to the location where it belongs.
  • If there are $n$ items in the list, then there are $n - 1$ pairs of items that need to be compared on the first pass. It is important to note that once the largest value in the list is part of a pair, it will continually be moved along until the pass is complete.
  • At the start of the second pass, the largest value is now in place. There are $n - 1$ items left to sort, meaning that there will be $n - 2$ pairs. Since each pass places the next largest value in place, the total number of passes necessary will be $n - 1$. After completing the $n - 1$ passes, the smallest item must be in the correct position with no further processing required.

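def bubble_sort(a_list):
    # After each pass, the largest remaining item has bubbled up to index pass_num.
    for pass_num in range(len(a_list) - 1, 0, -1):
        for i in range(pass_num):
            # Exchange adjacent items that are out of order.
            if a_list[i] > a_list[i + 1]:
                a_list[i], a_list[i + 1] = a_list[i + 1], a_list[i]


a_list = [54, 26, 93, 17, 77, 31, 44, 55, 20]
bubble_sort(a_list)
print(a_list)  # [17, 20, 26, 31, 44, 54, 55, 77, 93]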
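
  • The pass and comparison counts described above can be checked with a small sketch. The function name bubble_sort_counted and its counters are illustrative additions, not part of the lecture code; the sorting logic is the same bubble sort applied to a copy of the input.

```python
def bubble_sort_counted(items):
    """Bubble sort a copy of items; return (sorted_copy, passes, comparisons)."""
    result = list(items)
    passes = 0
    comparisons = 0
    for pass_num in range(len(result) - 1, 0, -1):
        passes += 1
        for i in range(pass_num):
            comparisons += 1  # one adjacent pair is compared here
            if result[i] > result[i + 1]:
                result[i], result[i + 1] = result[i + 1], result[i]
    return result, passes, comparisons


data = [54, 26, 93, 17, 77, 31, 44, 55, 20]
sorted_data, passes, comparisons = bubble_sort_counted(data)
print(passes, comparisons)  # 8 36
```

  • With $n = 9$ items there are $n - 1 = 8$ passes and $8 + 7 + \dots + 1 = 36$ comparisons, which is $n(n - 1)/2$.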

2. Selection sort

  • The selection sort improves on the bubble sort by making only one exchange for every pass through the list.
  • As with a bubble sort, after the first pass, the largest item is in the correct place. After the second pass, the next largest is in place. This process continues and requires $n - 1$ passes to sort $n$ items, since the final item must be in place after the $(n - 1)$st pass.
  • Suppose you have a bunch of music on your computer. For each artist, you have a play count, and you want to sort the artists from most to least played.
  • One way is to go through the list, find the most-played artist, and add that artist to a new list. Repeat with the remaining artists until the original list is empty.
  • To find the artist with the highest play count, you have to check each item in the list, which takes $O(n)$ time. You have to do that $n$ times, once for each artist.
  • This takes $O(n \times n)$ time, or $O(n^2)$ time.
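
  • The play-count example can be sketched directly. The artist names, play counts, and the helper names find_most_played and sort_by_play_count are made up for illustration; each call to find_most_played is one $O(n)$ scan, and it is called once per artist.

```python
def find_most_played(play_counts):
    """Return the index of the (artist, count) pair with the highest count."""
    best_index = 0
    for i in range(1, len(play_counts)):
        if play_counts[i][1] > play_counts[best_index][1]:
            best_index = i
    return best_index


def sort_by_play_count(play_counts):
    """Build a new list by repeatedly moving the most-played artist into it."""
    remaining = list(play_counts)  # copy so the caller's list is untouched
    ranked = []
    while remaining:
        ranked.append(remaining.pop(find_most_played(remaining)))
    return ranked


# Hypothetical library: (artist, play count)
library = [("Artist A", 94), ("Artist B", 156), ("Artist C", 35), ("Artist D", 141)]
ranked = sort_by_play_count(library)
print(ranked)
```

  • This version builds a second list for clarity; the in-place selection sort achieves the same ordering by exchanging items inside one list.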


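def selection_sort(a_list):
    # fill is the last index of the unsorted part; it shrinks by one each pass.
    for fill in range(len(a_list) - 1, 0, -1):
        pos_max = 0
        # Find the position of the largest item in a_list[0..fill].
        for location in range(1, fill + 1):
            if a_list[location] > a_list[pos_max]:
                pos_max = location
        # One exchange per pass: move the largest item to the end of the unsorted part.
        a_list[pos_max], a_list[fill] = a_list[fill], a_list[pos_max]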
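
  • To see the "one exchange per pass" improvement, the sketch below counts exchanges in both algorithms on the same input. The function names are illustrative; both functions work on copies and return only the exchange count.

```python
def count_bubble_exchanges(items):
    """Run a bubble sort on a copy and count how many exchanges it makes."""
    a = list(items)
    exchanges = 0
    for pass_num in range(len(a) - 1, 0, -1):
        for i in range(pass_num):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                exchanges += 1
    return exchanges


def count_selection_exchanges(items):
    """Run a selection sort on a copy and count how many exchanges it makes."""
    a = list(items)
    exchanges = 0
    for fill in range(len(a) - 1, 0, -1):
        pos_max = 0
        for location in range(1, fill + 1):
            if a[location] > a[pos_max]:
                pos_max = location
        a[pos_max], a[fill] = a[fill], a[pos_max]
        exchanges += 1  # exactly one exchange per pass
    return exchanges


data = [54, 26, 93, 17, 77, 31, 44, 55, 20]
print(count_bubble_exchanges(data))     # 20
print(count_selection_exchanges(data))  # 8
```

  • Both algorithms still make the same number of comparisons, so selection sort remains $O(n^2)$; only the number of exchanges drops.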

3. Insertion Sort


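def insertion_sort(a_list):
    # Items to the left of index are already sorted.
    for index in range(1, len(a_list)):
        current_value = a_list[index]
        position = index
        # Shift larger items one slot to the right to open a gap.
        while position > 0 and a_list[position - 1] > current_value:
            a_list[position] = a_list[position - 1]
            position = position - 1
        # Place the current value into the gap.
        a_list[position] = current_value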
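
  • A short trace may help show how the sorted part at the front of the list grows by one item per pass. The function insertion_sort_trace is an illustrative variant that works on a copy and records a snapshot after each insertion.

```python
def insertion_sort_trace(items):
    """Insertion sort a copy of items; return the list state after each pass."""
    a = list(items)
    snapshots = []
    for index in range(1, len(a)):
        current_value = a[index]
        position = index
        while position > 0 and a[position - 1] > current_value:
            a[position] = a[position - 1]
            position -= 1
        a[position] = current_value
        snapshots.append(list(a))  # a[0..index] is now sorted
    return snapshots


steps = insertion_sort_trace([54, 26, 93, 17, 77])
for step in steps:
    print(step)
# [26, 54, 93, 17, 77]
# [26, 54, 93, 17, 77]
# [17, 26, 54, 93, 77]
# [17, 26, 54, 77, 93]
```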

15.2 Sorting with Recursion

4. Merge Sort

  • Splitting the List in a Merge Sort
  • Merge Together

def merge_sort(a_list):
    print("Splitting ", a_list)
    if len(a_list) > 1:
        mid = len(a_list) // 2
        left_half = a_list[:mid]
        right_half = a_list[mid:]

        merge_sort(left_half)
        merge_sort(right_half)
        i = 0
        j = 0
        k = 0
        # Merge together: copy the smaller front item of the two halves back into a_list.
        while i < len(left_half) and j < len(right_half):
            if left_half[i] < right_half[j]:
                a_list[k] = left_half[i]
                i = i + 1
            else:
                a_list[k] = right_half[j]
                j = j + 1
            k = k + 1

        while i < len(left_half):
            a_list[k] = left_half[i]
            i = i + 1
            k = k + 1

        while j < len(right_half):
            a_list[k] = right_half[j]
            j = j + 1
            k = k + 1
    print("Merging ", a_list)
  • In order to analyze the merge_sort function, we need to consider the two distinct processes that make up its implementation: splitting the list into halves and merging the sorted halves.
  • A list of $n$ items can be halved $\log n$ times, and merging at each level of splitting costs $n$ operations, for a total of $n \log n$ operations. A merge sort is an $O(n \log n)$ algorithm.
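
  • The $n \log n$ total can be checked by counting how many items are written during merging. The sketch below is an illustrative variant, merge_sort_counted, that returns a new list instead of sorting in place; the counter is an addition for this demonstration only.

```python
import math


def merge_sort_counted(items):
    """Return (sorted_list, writes); writes counts items copied during merges."""
    if len(items) <= 1:
        return list(items), 0
    mid = len(items) // 2
    left, left_writes = merge_sort_counted(items[:mid])
    right, right_writes = merge_sort_counted(items[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two
    merged.extend(right[j:])  # still has items left
    return merged, left_writes + right_writes + len(merged)


for n in (8, 16, 1024):
    result, writes = merge_sort_counted(list(range(n, 0, -1)))
    print(n, writes, n * int(math.log2(n)))
# 8 24 24
# 16 64 64
# 1024 10240 10240
```

  • For list lengths that are powers of two, the count is exactly $n \log_2 n$; for other lengths it is close to that value.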

5. Quick Sort

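  • Base case: an empty array or an array with one element is already sorted, so there is nothing to do.

  • An array with two elements is pretty easy to sort, too: compare the two elements and swap them if they are out of order.

  • What about an array of three elements?

  • We use divide and conquer (D&C) to solve this problem. First pick a pivot, say, 33.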

  • Next, find the elements smaller than the pivot and the elements larger than the pivot. This is called partitioning. Now you have:
    • A sub-array of all the numbers less than the pivot
    • The pivot
    • A sub-array of all the numbers greater than the pivot

  • If the sub-arrays are sorted, then you can combine the whole thing like this—left array + pivot + right array—and you get a sorted array.

  • Suppose you have this array of five elements.

  • For example, suppose you pick 3 as the pivot. You call quicksort on the sub-arrays.
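
  • The steps so far (base case, pick a pivot, partition, recurse on the sub-arrays, combine) can be sketched as follows. This is one simple way to write it, assuming the first element is used as the pivot; the five-element array is a made-up example, and items equal to the pivot are placed in the left sub-array so that duplicates are not lost.

```python
def quicksort(array):
    # Base case: arrays with 0 or 1 element are already sorted.
    if len(array) < 2:
        return array
    pivot = array[0]  # assumption: use the first element as the pivot
    # Partition the remaining elements around the pivot.
    less = [x for x in array[1:] if x <= pivot]
    greater = [x for x in array[1:] if x > pivot]
    # Sort each sub-array recursively, then combine.
    return quicksort(less) + [pivot] + quicksort(greater)


print(quicksort([3, 5, 2, 1, 4]))  # [1, 2, 3, 4, 5]
```

  • This version builds new lists at every step, which keeps the code short; in-place partitioning schemes avoid the extra memory.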

  • Quicksort is unusual in that its running time depends on the pivot you choose.

  • In the best case, quick sort runs in $O(n \log n)$ time. In the worst case, it degrades to $O(n^2)$.

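  • Why?

  • Worst case: the pivot is always the smallest or largest element, so one sub-array is empty and the other holds everything else. The recursion is $n$ levels deep, and each level touches $O(n)$ elements, giving $O(n^2)$.

  • Best case: the pivot splits the array into two halves of roughly equal size. The recursion is only about $\log n$ levels deep, and each level still touches $O(n)$ elements, giving $O(n \log n)$.

  • If you pick the pivot at random, the average case matches the best case: $O(n \log n)$.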
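
  • The effect of the pivot on recursion depth can be observed with a small experiment. The function quicksort_with_depth and the three pivot-choosing helpers are illustrative names, not part of the lecture code; depth counts the number of nested calls on the longest path.

```python
import random


def quicksort_with_depth(array, choose_pivot_index):
    """Return (sorted_list, depth), where depth is the height of the call stack."""
    if len(array) < 2:
        return list(array), 1
    p = choose_pivot_index(array)
    pivot = array[p]
    rest = array[:p] + array[p + 1:]
    less, less_depth = quicksort_with_depth(
        [x for x in rest if x <= pivot], choose_pivot_index)
    greater, greater_depth = quicksort_with_depth(
        [x for x in rest if x > pivot], choose_pivot_index)
    return less + [pivot] + greater, 1 + max(less_depth, greater_depth)


def first_index(array):
    return 0  # worst choice for input that is already sorted


def middle_index(array):
    return len(array) // 2  # splits sorted input roughly in half


def random_index(array):
    return random.randrange(len(array))


data = list(range(1, 9))  # already sorted: 1..8
print(quicksort_with_depth(data, first_index)[1])   # 8
print(quicksort_with_depth(data, middle_index)[1])  # 4
print(quicksort_with_depth(data, random_index)[1])  # varies from run to run
```

  • On sorted input, the first-element pivot produces one level per element, while the middle pivot needs only about $\log_2 n$ levels.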

  • A variant of the Insertion Sort: Shell Sort
  • The sorting algorithm in Python: Timsort
  • Timsort is a hybrid stable sorting algorithm, derived from merge sort and insertion sort, designed to perform well on many kinds of real-world data.
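
  • In practice you rarely write a sort by hand in Python: the built-in sorted() function and the list.sort() method are documented as stable, and in CPython they are based on Timsort. A brief sketch of what "stable" means, using made-up (name, count) records:

```python
# Hypothetical records: (name, count)
records = [("pear", 3), ("apple", 1), ("fig", 3), ("kiwi", 1)]

# Sort by count only. Records with equal counts keep their original order.
by_count = sorted(records, key=lambda record: record[1])
print(by_count)
# [('apple', 1), ('kiwi', 1), ('pear', 3), ('fig', 3)]

# list.sort() sorts in place and is stable as well.
records.sort(key=lambda record: record[1], reverse=True)
print(records)
# [('pear', 3), ('fig', 3), ('apple', 1), ('kiwi', 1)]
```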

15.3 Greedy Algorithm

Greedy Algorithm (a kind of heuristic algorithm)

  • A very simple problem-solving strategy
  • So-called greedy algorithms are short-sighted, in that they make each choice in isolation, doing what looks good right here, right now. In many ways, eager or impatient might be better names for them.
  • Example: The classroom scheduling problem
  • Suppose you have a classroom and a list of classes, each with a start time and an end time. Some classes overlap, so you cannot hold all of them.
  • You want to hold as many classes as possible in this classroom. How do you pick the set of classes to hold so that the set is as large as possible?
  • Here's how the greedy algorithm works:
    • Pick the class that ends the soonest. This is the first class you'll hold in this classroom.
    • Now pick a class that starts no earlier than the first class ends. Again, pick the class that ends the soonest. This is the second class you'll hold.
    • Keep going until no remaining class fits.

Python Solution by Greedy Algorithm


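start_time = [9, 9.5, 10, 10.5, 11]
end_time = [10, 10.5, 11, 11.5, 12]
selected = []
earliest_end = 0  # end time of the most recently selected class

# Visit the classes in order of end time, so the one that ends soonest comes first.
for index in sorted(range(len(end_time)), key=lambda i: end_time[i]):
    # A class fits if it starts no earlier than the last selected class ends.
    if start_time[index] >= earliest_end:
        selected.append(index)
        earliest_end = end_time[index]
print(selected)  # [0, 2, 4]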
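
  • The same greedy rule can be wrapped in a function that accepts the classes in any order. This is a sketch under stated assumptions: the function name schedule_classes, the (name, start, end) tuple layout, and the class names are all hypothetical; the times match the lists above.

```python
def schedule_classes(classes):
    """Pick classes greedily by earliest end time.

    classes: list of (name, start, end) tuples, in any order.
    Returns the names of the selected classes in the order they are held.
    """
    selected = []
    last_end = float("-inf")  # nothing scheduled yet
    for name, start, end in sorted(classes, key=lambda c: c[2]):
        if start >= last_end:  # does not overlap the previous pick
            selected.append(name)
            last_end = end
    return selected


# Hypothetical class list, deliberately not sorted by end time.
classes = [
    ("music", 11, 12),
    ("cs", 10.5, 11.5),
    ("art", 9, 10),
    ("math", 10, 11),
    ("english", 9.5, 10.5),
]
print(schedule_classes(classes))  # ['art', 'math', 'music']
```

  • Sorting by end time costs $O(n \log n)$, after which a single pass over the classes is enough.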

Summary

  • Sorting Algorithms
  • Greedy Algorithm