Sorting Algorithms — With Python
Art of Bringing Order
Sorting means putting items in a particular order. That order is determined by a comparison property of the elements: for integers, smaller numbers come first and bigger numbers come later. Arranging items in a particular order makes searching for an element much faster, which is why sorting is used so heavily in computer science.
In this blog we will go through the common sorting algorithms and see a Python implementation of each. To compare their runtimes I used the Leetcode question on sorting an array. The constraints for that question are as follows.
Constraints:
1 <= nums.length <= 50000
-50000 <= nums[i] <= 50000
I solved this problem with each of the common sorting algorithms; the execution results are shown below.
Note: TLE means time limit exceeded.
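If you want to reproduce a rough comparison locally, a small timing harness like the sketch below works. The helper name time_sort and the input sizes are my own choices, not part of the Leetcode setup; the sort functions it times are the ones implemented later in this post, and the numbers you get will of course differ from Leetcode's.

import random
import time

def time_sort(sort_fn, n=10000, lo=-50000, hi=50000):
    # Time one sorting function on a random list matching the problem constraints.
    nums = [random.randint(lo, hi) for _ in range(n)]
    start = time.perf_counter()
    sort_fn(nums)
    return time.perf_counter() - start

# Example usage, assuming bubbleSort is defined as in the section below:
# print(f"bubble sort: {time_sort(bubbleSort):.3f} s")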
Bubble Sort
The simplest sorting algorithm. It iterates over the list repeatedly; in each pass it compares adjacent elements and swaps them whenever they are out of order, so that the larger elements bubble towards the end of the list.
Non-recursive
Stable
In place
O(n²)
def bubbleSort(array):
    swapped = False
    for i in range(len(array) - 1, 0, -1):
        for j in range(i):
            if array[j] > array[j + 1]:
                # Swap adjacent elements that are out of order.
                array[j], array[j + 1] = array[j + 1], array[j]
                swapped = True
        if swapped:
            swapped = False
        else:
            # No swaps in this pass: the array is already sorted.
            break
    return array
Selection Sort
In this algorithm we maintain two segments of the list, one sorted and one unsorted. We repeatedly take the smallest element from the unsorted segment and append it to the sorted segment. We don't swap intermediate elements, so this algorithm sorts the array in the minimum number of swaps.
Non-recursive
Unstable
In place
O(n²)
def selectionSort(array):
    for i in range(len(array) - 1):
        min_idx = i
        # Find the smallest element in the unsorted segment.
        for idx in range(i + 1, len(array)):
            if array[idx] < array[min_idx]:
                min_idx = idx
        # Swap it into place at the end of the sorted segment.
        array[i], array[min_idx] = array[min_idx], array[i]
    return array
Insertion Sort
Like selection sort, this algorithm segments the list into sorted and unsorted parts. We then iterate over the unsorted segment and insert each of its elements into the correct position in the sorted part.
Non-recursive
Stable
In place
O(n²)
def insertionSort(array):
    for i in range(1, len(array)):
        key = array[i]
        j = i - 1
        # Shift larger elements of the sorted part one position to the right.
        while j >= 0 and array[j] > key:
            array[j + 1] = array[j]
            j -= 1
        array[j + 1] = key
    return array
Shell Sort
Shell sort is an optimization over insertion sort. It repeatedly performs insertion sort on the elements that lie a fixed interval apart, with the interval decreasing between passes. In the last pass the interval is 1, so it becomes a regular insertion sort, which guarantees the array will be sorted. The key point is that by then the array is almost sorted, so that final pass is very fast.
Non-recursive
Unstable
In place
O(n²); also depends on the choice of interval sequence
import math

def shellSort(array):
    n = len(array)
    k = int(math.log2(n))
    interval = 2**k - 1
    while interval > 0:
        # Gapped insertion sort for the current interval.
        for i in range(interval, n):
            temp = array[i]
            j = i
            while j >= interval and array[j - interval] > temp:
                array[j] = array[j - interval]
                j -= interval
            array[j] = temp
        # Shrink the interval (gaps of the form 2^k - 1).
        k -= 1
        interval = 2**k - 1
    return array
Heap Sort
Like the previous two algorithms, we maintain two segments of the list, one sorted and one unsorted. Here we use the heap data structure to efficiently get the maximum element from the unsorted segment. The heapify method uses recursion to move the largest element to the top of the heap.
Recursive (the heapify helper)
Unstable
In place
O(n log n)
def heapify(array, n, i):
    # Sift the element at index i down so the subtree rooted at i is a max-heap.
    largest = i
    l = 2 * i + 1
    r = 2 * i + 2
    if l < n and array[i] < array[l]:
        largest = l
    if r < n and array[largest] < array[r]:
        largest = r
    if largest != i:
        array[i], array[largest] = array[largest], array[i]
        heapify(array, n, largest)

def heapSort(array):
    n = len(array)
    # Build a max-heap.
    for i in range(n // 2, -1, -1):
        heapify(array, n, i)
    # Repeatedly move the max to the end and restore the heap property.
    for i in range(n - 1, 0, -1):
        array[i], array[0] = array[0], array[i]
        heapify(array, i, 0)
    return array
Merge Sort
This is a divide-and-conquer algorithm. We split the list in half and keep splitting until each piece contains a single element. Then we merge the sorted sublists, and keep merging until we obtain a single sorted list containing all the elements of the unsorted input.
Recursive
Stable
Needs extra space
O(n log n)
def mergeSort(nums):
    if len(nums) <= 1:
        return nums
    # Split, sort each half recursively, then merge.
    mid = (len(nums) - 1) // 2
    lst1 = mergeSort(nums[:mid + 1])
    lst2 = mergeSort(nums[mid + 1:])
    return merge(lst1, lst2)

def merge(lst1, lst2):
    lst = []
    i = 0
    j = 0
    # Take the smaller head element until one list is exhausted.
    while i <= len(lst1) - 1 and j <= len(lst2) - 1:
        if lst1[i] < lst2[j]:
            lst.append(lst1[i])
            i += 1
        else:
            lst.append(lst2[j])
            j += 1
    # Append whatever remains in the list that is not yet exhausted.
    if i > len(lst1) - 1:
        while j <= len(lst2) - 1:
            lst.append(lst2[j])
            j += 1
    else:
        while i <= len(lst1) - 1:
            lst.append(lst1[i])
            i += 1
    return lst
Quick Sort
In this algorithm we partition the list around a pivot element, placing smaller values before the pivot and larger values after it. In my solution I used the last element of the list as the pivot value. The best performance is achieved when the pivot splits the list into two nearly equal halves.
Recursive
Not in place (this implementation builds new lists)
Unstable
O(n log n) on average
def quickSort(array):
    if len(array) > 1:
        # Use the last element as the pivot.
        pivot = array.pop()
        grtr_lst, equal_lst, smlr_lst = [], [pivot], []
        # Partition the remaining elements around the pivot.
        for item in array:
            if item == pivot:
                equal_lst.append(item)
            elif item > pivot:
                grtr_lst.append(item)
            else:
                smlr_lst.append(item)
        return quickSort(smlr_lst) + equal_lst + quickSort(grtr_lst)
    else:
        return array
Counting Sort
This algorithm does not compare elements with each other. Instead it uses the mathematical properties of integers: we count how many times each value occurs and store that count in an array whose index maps to the value.
Non-Recursive
Writes the output back in place, but needs an auxiliary count array
Stable
O(n + k), where k is the range of the input values
from typing import List

# Leetcode-style method (part of the Solution class shown at the end).
def sortArray(self, nums: List[int]) -> List[int]:
    i_lower_bound, upper_bound = min(nums), max(nums)
    lower_bound = i_lower_bound
    # Shift all values to be non-negative if the minimum is negative.
    if i_lower_bound < 0:
        lb = abs(i_lower_bound)
        nums = [item + lb for item in nums]
        lower_bound, upper_bound = min(nums), max(nums)
    # Count occurrences of each value.
    counter_nums = [0] * (upper_bound - lower_bound + 1)
    for item in nums:
        counter_nums[item - lower_bound] += 1
    # Write the values back in sorted order.
    pos = 0
    for idx, item in enumerate(counter_nums):
        num = idx + lower_bound
        for i in range(item):
            nums[pos] = num
            pos += 1
    # Undo the shift.
    if i_lower_bound < 0:
        lb = abs(i_lower_bound)
        nums = [item - lb for item in nums]
    return nums
I would also like to mention radix sort, which can be implemented with counting sort or bucket sort as a subroutine. See the link below for reference.
Counting Sort & Radix Sort
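As a rough illustration only (not the implementation from the linked article), here is a minimal sketch of least-significant-digit radix sort for non-negative integers, using a stable bucket pass per digit; handling negative numbers would need the same offset trick used in the counting sort above.

def radixSort(array):
    # Minimal LSD radix sort sketch for non-negative integers.
    if not array:
        return array
    max_val = max(array)
    exp = 1
    while max_val // exp > 0:
        # Stable distribution pass on the digit at position `exp`.
        buckets = [[] for _ in range(10)]
        for item in array:
            buckets[(item // exp) % 10].append(item)
        array = [item for bucket in buckets for item in bucket]
        exp *= 10
    return array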
Putting it all together:
from typing import List

class Solution:
    def sortArray(self, nums: List[int]) -> List[int]:
        # BUBBLE_SORT
        def bubbleSort(array):
            swapped = False
            for i in range(len(array) - 1, 0, -1):
                for j in range(i):
                    if array[j] > array[j + 1]:
                        array[j], array[j + 1] = array[j + 1], array[j]
                        swapped = True
                if swapped:
                    swapped = False
                else:
                    break
            return array

        # INSERTION_SORT
        def insertionSort(array):
            for i in range(1, len(array)):
                key = array[i]
                j = i - 1
                while j >= 0 and array[j] > key:
                    array[j + 1] = array[j]
                    j -= 1
                array[j + 1] = key
            return array

        # SELECTION_SORT
        def selectionSort(array):
            for i in range(len(array) - 1):
                min_idx = i
                for idx in range(i + 1, len(array)):
                    if array[idx] < array[min_idx]:
                        min_idx = idx
                array[i], array[min_idx] = array[min_idx], array[i]
            return array

        # HEAP_SORT
        def heapify(array, n, i):
            largest = i
            l = 2 * i + 1
            r = 2 * i + 2
            if l < n and array[i] < array[l]:
                largest = l
            if r < n and array[largest] < array[r]:
                largest = r
            if largest != i:
                array[i], array[largest] = array[largest], array[i]
                heapify(array, n, largest)

        def heapSort(array):
            n = len(array)
            for i in range(n // 2, -1, -1):
                heapify(array, n, i)
            for i in range(n - 1, 0, -1):
                array[i], array[0] = array[0], array[i]
                heapify(array, i, 0)
            return array

        # MERGE_SORT
        def mergeSort(nums):
            if len(nums) <= 1:
                return nums
            mid = (len(nums) - 1) // 2
            lst1 = mergeSort(nums[:mid + 1])
            lst2 = mergeSort(nums[mid + 1:])
            return merge(lst1, lst2)

        def merge(lst1, lst2):
            lst = []
            i = 0
            j = 0
            while i <= len(lst1) - 1 and j <= len(lst2) - 1:
                if lst1[i] < lst2[j]:
                    lst.append(lst1[i])
                    i += 1
                else:
                    lst.append(lst2[j])
                    j += 1
            if i > len(lst1) - 1:
                while j <= len(lst2) - 1:
                    lst.append(lst2[j])
                    j += 1
            else:
                while i <= len(lst1) - 1:
                    lst.append(lst1[i])
                    i += 1
            return lst

        # QUICK_SORT
        def quickSort(array):
            if len(array) > 1:
                pivot = array.pop()
                grtr_lst, equal_lst, smlr_lst = [], [pivot], []
                for item in array:
                    if item == pivot:
                        equal_lst.append(item)
                    elif item > pivot:
                        grtr_lst.append(item)
                    else:
                        smlr_lst.append(item)
                return quickSort(smlr_lst) + equal_lst + quickSort(grtr_lst)
            else:
                return array

        # SHELL_SORT
        def shellSort(array):
            n = len(array)
            interval = n // 2
            while interval > 0:
                for i in range(interval, n):
                    temp = array[i]
                    j = i
                    while j >= interval and array[j - interval] > temp:
                        array[j] = array[j - interval]
                        j -= interval
                    array[j] = temp
                interval //= 2
            return array

        # Uncomment one of the following lines to pick a sorting algorithm:
        # nums = bubbleSort(nums)
        # nums = insertionSort(nums)
        # nums = selectionSort(nums)
        # nums = heapSort(nums)
        # nums = mergeSort(nums)
        # nums = quickSort(nums)
        # nums = shellSort(nums)
        return nums
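As a quick local sanity check (a hypothetical snippet, not part of the Leetcode submission), you can uncomment one of the nums = ... lines above and compare the result against Python's built-in sorted():

import random

nums = [random.randint(-50000, 50000) for _ in range(1000)]
assert Solution().sortArray(list(nums)) == sorted(nums)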
After trying all of these different algorithms, just out of curiosity I submitted my solution with the default sort method. It was pretty fast, with a runtime of 152 ms. Python's default sort uses Timsort, which is a combination of merge sort and insertion sort. Its implementation will be covered in another article.
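For reference, the built-in sort can be used either in place with list.sort() or via sorted(), which returns a new list; both use Timsort under the hood. A small illustrative snippet:

nums = [5, 2, 9, 1]
nums.sort()              # sorts the list in place
new_list = sorted(nums)  # returns a new sorted list, leaving nums unchanged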
I found an amazing playlist where sorting algorithms are demonstrated with dance. You may want to watch it; it's worth the time.
With our little experiment we got to know different algorithms, their runtimes, and their memory consumption. We now know the parameters, such as memory, CPU time, and stability, that need to be weighed against a given problem to zero in on an algorithm. We can also combine these basic sorting algorithms into more powerful ones, like Timsort.
Happy sorting!!