Sorting means arranging items in a particular order, determined by a comparison property of the elements. For integers, the smaller number comes first and the larger number later. Keeping items in order makes searching for an element much faster, which is why sorting sees such heavy use in computer science.
In this blog we will go through common sorting algorithms and see an implementation of each in Python. To compare their runtimes I used a Leetcode question on sorting an array and solved it with all of the common sorting algorithms. Below are the execution results.
Image by Author
Note: TLE means time limit exceeded.
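The timings above came from Leetcode's judge, but a rough local comparison can be made with a small harness like the following sketch (the `benchmark` helper and the random-input parameters are my own choices, not from the original experiment):

```python
import random
import timeit

def benchmark(sort_fn, n=10_000, repeats=3):
    """Return the best wall-clock time of `repeats` runs of sort_fn,
    each on a fresh copy of the same random integer list."""
    data = [random.randint(-50_000, 50_000) for _ in range(n)]
    return min(
        timeit.timeit(lambda: sort_fn(list(data)), number=1)
        for _ in range(repeats)
    )
```

For example, `benchmark(sorted)` times Python's built-in sort; any of the functions below can be passed in the same way.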
Bubble Sort
The simplest sorting algorithm. It iterates over the list, comparing adjacent elements in pairs and swapping them so that the larger element moves towards the end of the list.
- Non-recursive
- Stable
- In place
- O(n²)
```python
def bubbleSort(array):
    swapped = False
    for i in range(len(array)-1, 0, -1):
        for j in range(i):
            if array[j] > array[j+1]:
                array[j], array[j+1] = array[j+1], array[j]
                swapped = True
        if swapped:
            swapped = False
        else:
            # no swaps in a full pass: the array is already sorted
            break
    return array
```
Selection Sort
In this algorithm we divide the list into two segments, one sorted and one unsorted. We repeatedly remove the smallest element from the unsorted segment and append it to the sorted segment. We never swap intermediate elements, so this algorithm sorts the array with a minimum number of swaps.
- Non-recursive
- Unstable
- In place
- O(n²)
```python
def selectionSort(array):
    for i in range(len(array)-1):
        min_idx = i
        # find the smallest element in the whole unsorted segment
        for idx in range(i + 1, len(array)):
            if array[idx] < array[min_idx]:
                min_idx = idx
        array[i], array[min_idx] = array[min_idx], array[i]
    return array
```
Insertion Sort
Like selection sort, this algorithm segments the list into sorted and unsorted parts. We then iterate over the unsorted segment and insert each element into its correct position in the sorted segment.
- Non-recursive
- Stable
- In place
- O(n²)
```python
def insertionSort(array):
    for i in range(1, len(array)):
        key = array[i]
        j = i - 1
        # check j >= 0 first so we never index with a negative j
        while j >= 0 and array[j] > key:
            array[j+1] = array[j]
            j -= 1
        array[j+1] = key
    return array
```
Shell Sort
Shell sort is an optimization over insertion sort, achieved by repeatedly running insertion sort over the elements at fixed, decreasing intervals. In the last iteration the interval is 1, so it becomes a regular insertion sort, which guarantees the array ends up sorted. By that point, however, the array is almost sorted, so the final pass is very fast.
- Non-recursive
- Unstable (the long-distance moves can reorder equal elements)
- In place
- O(n²), depending on the choice of interval sequence
```python
import math

def shellSort(array):
    n = len(array)
    # intervals of the form 2**k - 1, shrinking down to 1
    k = int(math.log2(n))
    interval = 2**k - 1
    while interval > 0:
        for i in range(interval, n):
            temp = array[i]
            j = i
            while j >= interval and array[j - interval] > temp:
                array[j] = array[j - interval]
                j -= interval
            array[j] = temp
        k -= 1
        interval = 2**k - 1
    return array
```
Heap Sort
Like the previous two algorithms, we divide the list into a sorted and an unsorted segment. Here we use the heap data structure to efficiently get the maximum element from the unsorted segment. The heapify method uses recursion to move the largest element to the top.
- Non-recursive (apart from the recursive heapify helper)
- Unstable
- In place
- O(n log n)
```python
def heapify(array, n, i):
    largest = i
    l = 2 * i + 1
    r = 2 * i + 2

    if l < n and array[i] < array[l]:
        largest = l
    if r < n and array[largest] < array[r]:
        largest = r

    if largest != i:
        array[i], array[largest] = array[largest], array[i]
        heapify(array, n, largest)

def heapSort(array):
    n = len(array)
    # build a max heap
    for i in range(n//2, -1, -1):
        heapify(array, n, i)
    # repeatedly move the max to the end and re-heapify the rest
    for i in range(n-1, 0, -1):
        array[i], array[0] = array[0], array[i]
        heapify(array, i, 0)
    return array
```
Merge Sort
This is a divide and conquer algorithm. We split the list in half, and keep splitting each half until only single elements remain. Then we merge the sorted sublists back together, repeating until we get one sorted list containing all the elements of the unsorted input.
- Recursive
- Stable
- Needs extra space
- O(n log n)
```python
def mergeSort(nums):
    if len(nums) <= 1:
        return nums
    mid = (len(nums)-1) // 2
    lst1 = mergeSort(nums[:mid+1])
    lst2 = mergeSort(nums[mid+1:])
    return merge(lst1, lst2)

def merge(lst1, lst2):
    lst = []
    i = j = 0
    while i < len(lst1) and j < len(lst2):
        if lst1[i] < lst2[j]:
            lst.append(lst1[i])
            i += 1
        else:
            lst.append(lst2[j])
            j += 1
    # append whatever remains on the non-exhausted side
    lst.extend(lst1[i:])
    lst.extend(lst2[j:])
    return lst
```
Quick Sort
In this algorithm we partition the list around a pivot element, placing smaller values before the pivot and larger values after it. In my solution I used the last element of the list as the pivot. Best performance is achieved when the pivot splits the list into two almost equal halves.
- Recursive
- Unstable
- O(n log n) on average, O(n²) in the worst case
- The classic version is in place, though this implementation builds new sublists
```python
def quickSort(array):
    if len(array) > 1:
        pivot = array.pop()
        grtr_lst, equal_lst, smlr_lst = [], [pivot], []
        # partition the remaining elements around the pivot
        for item in array:
            if item == pivot:
                equal_lst.append(item)
            elif item > pivot:
                grtr_lst.append(item)
            else:
                smlr_lst.append(item)
        return quickSort(smlr_lst) + equal_lst + quickSort(grtr_lst)
    else:
        return array
```
Counting Sort
This algorithm does no comparisons between elements. Instead it uses the mathematical properties of integers: we count how many times each number occurs and store the count in an array whose index maps to the key's value.
- Non-recursive
- Stable
- Needs extra space for the counting array
- O(n + k), where k is the range of the input values
```python
def countingSort(nums):
    if not nums:
        return nums
    lower_bound, upper_bound = min(nums), max(nums)
    counter_nums = [0] * (upper_bound - lower_bound + 1)
    # count occurrences, shifting by lower_bound to handle negatives
    for item in nums:
        counter_nums[item - lower_bound] += 1
    # write each value back as many times as it was counted
    pos = 0
    for idx, count in enumerate(counter_nums):
        num = idx + lower_bound
        for _ in range(count):
            nums[pos] = num
            pos += 1
    return nums
```
I would also like to mention radix sort, which can be implemented with counting sort or bucket sort as a subroutine. See the link below for reference.
Counting Sort & Radix Sort
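For completeness, here is one possible sketch of an LSD radix sort for non-negative integers, using a stable per-digit bucketing pass as the subroutine (the function name and the choice of base 10 are my own, not from the linked post):

```python
def radixSort(nums):
    """Sort non-negative integers digit by digit, least significant
    digit first, with a stable bucketing pass for each digit."""
    if not nums:
        return nums
    placeval = 1
    max_val = max(nums)
    while placeval <= max_val:
        # one stable pass: distribute into buckets 0-9 by current digit
        buckets = [[] for _ in range(10)]
        for num in nums:
            buckets[(num // placeval) % 10].append(num)
        # collect in bucket order, preserving insertion order within each
        nums = [num for bucket in buckets for num in bucket]
        placeval *= 10
    return nums
```

Because each bucketing pass is stable, ties on the current digit preserve the order established by earlier (less significant) digits.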
After trying all of these different algorithms, just out of curiosity I submitted my solution with the default sort method. It was pretty fast, with a runtime of 152 ms. Python's default sort uses Timsort, which is a combination of merge sort and insertion sort. Its implementation will be covered in another article.
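For context, the default-sort submission boils down to nothing more than the built-in calls (a trivial sketch):

```python
nums = [5, 2, 9, 1, 5]
print(sorted(nums))       # returns a new sorted list: [1, 2, 5, 5, 9]
nums.sort(reverse=True)   # sorts the list in place, descending
print(nums)               # [9, 5, 5, 2, 1]
```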
I also found an amazing playlist where sorting algorithms are demonstrated with dance. You may want to watch it; it's worth the time.
With our little experiment we learned about different algorithms and their runtime and memory consumption. We now know the parameters to weigh, such as memory, CPU time and stability, and we can evaluate them against a given problem to zero in on an algorithm. We can also combine these basic sorting algorithms into more powerful ones, like Timsort.
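As a concrete illustration of the stability parameter: a stable sort preserves the original order of elements with equal keys, which matters whenever records carry more data than the sort key (the example data here is my own):

```python
records = [("alice", 3), ("bob", 1), ("carol", 3), ("dave", 2)]
# Python's built-in sort is stable: "alice" keeps its place before
# "carol" because both share the key 3.
by_score = sorted(records, key=lambda r: r[1])
print(by_score)  # [('bob', 1), ('dave', 2), ('alice', 3), ('carol', 3)]
```

With an unstable sort, such as the selection sort or heap sort above, the relative order of the two score-3 records could come out either way.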
Happy sorting!!