Sorting Algorithms — With Python

 

Art of Bringing Order

Sorting means putting items in a particular order. That particular order is determined by the comparison property of the elements. In case of integers we say smaller number comes first and bigger number appear later. Arranging items in particular order improves the searching of the element. Hence sorting has heavy usage in computer science.

In this blog we will go through common sorting algorithms. We will see implementation of each in python. To compare their runtime I used Leetcode question on sorting array. The constraint for the question is as follows.

Constraints:

1 <= nums.length <= 50000
-50000 <= nums[i] <= 50000

I solved this problem with all common sorting algorithms. Below are the execution results.

Image by Author

Note: TLE is time limit exceeded.

Bubble Sort

Simplest sorting algorithm. Iterates over the list, in each iteration it compares elements in pairs and keeps swapping them such that the larger element is moved towards the end of the list.

Non recursive
Stable
In place
O(n²)

def bubbleSort(array):
swapped = False
for i in range(len(array)-1,0,-1):
for j in range(i):
if array[j]>array[j+1]:
array[j], array[j+1] = array[j+1], array[j]
swapped= True
if swapped:
swapped=False
else:
break
return array

Selection Sort

In this algorithm we create two segments of the list one sorted and other unsorted. We continuously remove the smallest element from the unsorted segment of the list and append it to the sorted segment. We don’t swap intermediate elements. Hence this algorithm sorts the array in minimum number of swaps.

Non recursive
Unstable
In place
O(n²)

def selectionSort(array):
for i in range(len(array)-1):
min_idx = i
for idx in range(i + 1, len(array)-1):
if array[idx] < array[min_idx]:
min_idx = idx
array[i], array[min_idx] = array[min_idx], array[i]
return array

Insertion Sort

Like Selection Sort, in this algorithm we segment the list into sorted and unsorted parts. Then we iterate over the unsorted segment, and insert the element from this segment into the correct position in the sorted list.

Non-recursive
Stable
In place
O(n²)

def insertionSort(array):
for i in range(1, len(array)):
key = array[i]
j = i-1
while array[j] > key and j >= 0:
array[j+1] = array[j]
j -= 1
array[j+1] = key
return array

Shell Sort

Shell sort is an optimization over insertion sort. This is achieved by repeatedly doing insertion sort on all elements at fixed, decreasing intervals. Last iteration the interval is 1. Here it becomes a regular insertion sort and it guarantees that the array will be sorted. But to note one point is that by the time we do that array is almost sorted, hence the iteration is very fast.

Non-recursive
Stable
In place
0(n²) also depends on interval choice

def shellSort(array):
n = len(array)
k = int(math.log2(n))
interval = 2**k -1
while interval > 0:
for i in range(interval, n):
temp = array[i]
j = i
while j >= interval and array[j - interval] > temp:
array[j] = array[j - interval]
j -= interval
array[j] = temp
k -= 1
interval = 2**k -1
return array

Heap Sort

Like last two previous algorithms we create two segments of the list one sorted and one unsorted. In this we use heap data structure to efficiently get the max element from the unsorted segment of the list. Heapify method uses recursion to get the max element at the top.

Non-recursive
Unstable
In place
O(nlogn)

def heapify(array, n, i):
largest = i
l = 2 * i + 1
r = 2 * i + 2

if l < n and array[i] < array[l]:
largest = l
if r < n and array[largest] < array[r]:
largest = r

if largest != i:
array[i], array[largest] = array[largest], array[i]
heapify(array, n, largest)

def heapSort(array):
n = len(array)
for i in range(n//2, -1, -1):
heapify(array, n, i)
for i in range(n-1, 0, -1):
array[i], array[0] = array[0], array[i]
heapify(array, i, 0)
return array

Merge Sort

This is a divide and conquer algorithm. In this algorithm we split a list in half, and keeps splitting the list by 2 until it only has single element. Then we merge the sorted sorted list. We keep doing this until we get a sorted list with all the elements of the unsorted input list.

Recursive
Stable
Needs extra space
O(nlogn)

def mergeSort(nums):
if len(nums)==1:
return nums
mid = (len(nums)-1) // 2
lst1 = mergeSort(nums[:mid+1])
lst2 = mergeSort(nums[mid+1:])
result = merge(lst1, lst2)
return result
def merge(lst1, lst2):
lst = []
i = 0
j = 0
while(i<=len(lst1)-1 and j<=len(lst2)-1):
if lst1[i]<lst2[j]:
lst.append(lst1[i])
i+=1
else:
lst.append(lst2[j])
j+=1
if i>len(lst1)-1:
while(j<=len(lst2)-1):
lst.append(lst2[j])
j+=1
else:
while(i<=len(lst1)-1):
lst.append(lst1[i])
i+=1
return lst

Quick Sort

In this algorithm we partition the list around a pivot element, sorting values around the pivot. In my solution I used the the last element from the list as pivot value. Best performance is achieved when the pivot value splits the list in two almost equal halves.

Recursive
In place
Unstable
O(nlogn)

def quickSort(array):
if len(array)> 1:
pivot=array.pop()
grtr_lst, equal_lst, smlr_lst = [], [pivot], []
for item in array:
if item == pivot:
equal_lst.append(item)
elif item > pivot:
grtr_lst.append(item)
else:
smlr_lst.append(item)
return (quickSort(smlr_lst) + equal_lst + quickSort(grtr_lst))
else:
return array

Counting Sort

This algorithm does not do comparison between the elements. We use the mathematical properties of the integers to sort. We count how many time a number has come and store the count in the array where index is mapped to key’s value.

Non-Recursive
In place but needs extra space
Stable
O(n)

def sortArray(self, nums: List[int]) -> List[int]:
i_lower_bound , upper_bound = min(nums), max(nums)
lower_bound = i_lower_bound
if i_lower_bound < 0:
lb = abs(i_lower_bound)
nums = [item + lb for item in nums]
lower_bound , upper_bound = min(nums), max(nums)

counter_nums = [0]*(upper_bound-lower_bound+1)
for item in nums:
counter_nums[item-lower_bound] += 1
pos = 0
for idx, item in enumerate(counter_nums):
num = idx + lower_bound
for i in range(item):
nums[pos] = num
pos += 1
if i_lower_bound < 0:
lb = abs(i_lower_bound)
nums = [item - lb for item in nums]
return nums

I would also like to mention Radix sort, which can be implemented with count sort/bucket sort as subroutine. See below link for reference.

Counting Sort & Radix Sort

In this blog we will go through two common non-comparison based sorting algorithms. But before we do that, why do we…

Putting it all together:

class Solution:
def sortArray(self, nums: List[int]) -> List[int]:
# BUBBLE_SORT
def bubbleSort(array):
swapped = False
for i in range(len(array)-1,0,-1):
for j in range(i):
if array[j]>array[j+1]:
array[j], array[j+1] = array[j+1], array[j]
swapped= True
if swapped:
swapped=False
else:
break
return array
# INSERTION_SORT
def insertionSort(array):
for i in range(1, len(array)):
key = array[i]
j = i-1
while array[j] > key and j >= 0:
array[j+1] = array[j]
j -= 1
array[j+1] = key
return array
# SELECTION_SORT
def selectionSort(array):
for i in range(len(array)-1):
min_idx = i
for idx in range(i + 1, len(array)-1):
if array[idx] < array[min_idx]:
min_idx = idx
array[i], array[min_idx] = array[min_idx], array[i]
return array
# HEAP_SORT
def heapify(array, n, i):
largest = i
l = 2 * i + 1
r = 2 * i + 2
if l < n and array[i] < array[l]:
largest = l
if r < n and array[largest] < array[r]:
largest = r
if largest != i:
array[i], array[largest] = array[largest], array[i]
heapify(array, n, largest)
def heapSort(array):
n = len(array)
for i in range(n//2, -1, -1):
heapify(array, n, i)
for i in range(n-1, 0, -1):
array[i], array[0] = array[0], array[i]
heapify(array, i, 0)
return array
# MERGE_SORT
def mergeSort(nums):
if len(nums)==1:
return nums
mid = (len(nums)-1) // 2
lst1 = mergeSort(nums[:mid+1])
lst2 = mergeSort(nums[mid+1:])
result = merge(lst1, lst2)
return result
def merge(lst1, lst2):
lst = []
i = 0
j = 0
while(i<=len(lst1)-1 and j<=len(lst2)-1):
if lst1[i]<lst2[j]:
lst.append(lst1[i])
i+=1
else:
lst.append(lst2[j])
j+=1
if i>len(lst1)-1:
while(j<=len(lst2)-1):
lst.append(lst2[j])
j+=1
else:
while(i<=len(lst1)-1):
lst.append(lst1[i])
i+=1
return lst
# QUICK_SORT
def quickSort(array):
if len(array)> 1:
pivot=array.pop()
grtr_lst, equal_lst, smlr_lst = [], [pivot], []
for item in array:
if item == pivot:
equal_lst.append(item)
elif item > pivot:
grtr_lst.append(item)
else:
smlr_lst.append(item)
return (quickSort(smlr_lst) + equal_lst + quickSort(grtr_lst))
else:
return array
#SHELL_SORT
def shellSort(array):
n = len(array)
interval = n // 2
while interval > 0:
for i in range(interval, n):
temp = array[i]
j = i
while j >= interval and array[j - interval] > temp:
array[j] = array[j - interval]
j -= interval
array[j] = temp
interval //= 2
return array
#nums = bubbleSort(nums)
#nums = insertionSort(nums)
#nums = selectionSort(nums)
#nums = heapSort(nums)
#nums = mergeSort(nums)
#nums = quickSort(nums)
return nums


After trying all of these different algorithms, just for curiosity I submitted my solution with default sort method. It was pretty fast with runtime of 152 (ms). Python’s default sort uses Tim Sort, which is combination of both merge sort and insertion sort. Implementation of that will be covered in another article.

Found an amazing playlist where sorting algorithms are demonstrated with dance. You may want to watch it. Its worth the time.

Courtesy: https://www.youtube.com/channel/UCIqiLefbVHsOAXDAxQJH7Xw

With our little experiment we came to know of different algorithms, their runtimes and memory consumption. We now know the parameters like memory, CPU time and stability of the algorithm. We need to evaluate these parameters against the given problem to zero in on an algorithm. We can also combine these base sorting algorithm into more powerful one like Tim Sort.

Happy sorting!!

Sign up for The Variable

By Towards Data Science

Every Thursday, the Variable delivers the very best of Towards Data Science: from hands-on tutorials and cutting-edge research to original features you don't want to miss. Take a look.

By signing up, you will create a Medium account if you don’t already have one. Review our Privacy Policy for more information about our privacy practices.

More from Towards Data Science

Your home for data science. A Medium publication sharing concepts, ideas and codes.

Comments

Popular posts from this blog

Flutter for Single-Page Scrollable Websites with Navigator 2.0

A Data Science Portfolio is More Valuable than a Resume

Better File Storage in Oracle Cloud