Tim Sort in Python
In Python, Tim Sort is a hybrid sorting algorithm that combines the advantages and effectiveness of both Insertion Sort and Merge Sort.
In Python, the Tim Sort algorithm breaks down arrays into several sub-arrays known as Runs, which are pre-sorted. A completely sorted array is generated by merging all elements from these smaller sorted runs.
How Tim Sort Works?
Let us see an example to deep dive into Tim Sort:
Let's consider the array arr = {18, 2, 1, 12, 5, 3, 8, 11, 7} and analyze it through five distinct steps:
Step 1: Determine the Run Size
- In this stage, the dimensions of the run are established.
- Minimum run size: 32 (This minimum threshold is not significant in this case due to the small array size)
Step 2: Identify Runs
Divide the array into multiple sub-arrays known as Runs. Let us see these runs gradually:
- Run 1 = The descending sub-array [18, 2, 1] is reversed to [1, 2, 18] by Tim Sort.
- Run 2 = [12, 5, 3] is also descending, so Tim will reverse this to [3, 5, 12].
- Run 3 = [8,11] is already in ascending order, so it will remain unchanged.
- Run 4 = [7] is a single-element run and remains as is.
Step 3: Merge the Runs
In this step, we will use the modified merge sort method to merge all runs. The entire array is sorted by merging all the runs.
- Merge [1, 2, 18] and [3, 5, 12], which will result in à [1, 2, 3, 5, 12, 18]
- Merge [8, 11] and [7] to [7, 8, 11]
- Now we will do the final merge: [1, 2, 3, 5, 12, 18] + [7, 8, 11] à[1, 2, 3, 5, 7, 8, 11, 12, 18].
- Final Sorted Array is: [1, 2, 3, 5, 7, 8, 11, 12, 18]
Step 4: Modifying the Run Size
- It is necessary to increase the run size twofold following each merge operation.
- This adjustment is unnecessary since the array is small in this case.
Step 5: Continue Merging
- The merging process will repeat until the entire array is sorted.
- The Final merged run: [1, 2, 3, 5, 7, 8, 11, 12, 18]
- So, we got the final sorted array: [1, 2, 3, 5, 7, 8, 11, 12, 18]
Pseudocode of Tim Sort
Let's examine the Pseudocode for Tim Sort in Python to gain a deeper understanding of this subject:
Example
function TIM_SORT(array):
n = length(array)
MIN_MERGE = 32
# Step 1: Find the minimum run size
minRun = calcMinRun(n)
# Step 2: Using the Insertion Sort method to sort the small runs
for start in range(0, n, minRun):
end = min(start + minRun - 1, n - 1)
insertionSort(array, start, end)
# Step 3: Merge sorted runs using the Merge
size = minRun
while size < n:
for left from 0 to n-1 in steps of 2*size:
mid = min(left + size - 1, n - 1)
right = min(left + 2*size - 1, n - 1)
if mid < right:
merge(array, left, mid, right)
# Here we are doubling the size of the merged run
size = size * 2
Explanation
In the Pseudocode provided, a minimum run size of 32 has been established. The function calcMinRun is employed to determine the minimum run size for the array. Following this, the smaller runs are sorted through the Insertion sort method, and these sorted sub-arrays are subsequently merged until the complete array is organized.
Implementation of Tim Sort in Python
In the following example, we will demonstrate the use of Tim Sort in Python:
Example
# Implementation of Tim Sort in Python
MIN_MERGE = 32
def calcMinRun(n):
"""It returns the minimum run length
(between 23 and 64) such that dividing
the array length by this value results
in a number less than or equal to a power of two.
"""
r = 0
while n >= MIN_MERGE:
r |= n & 1
n >>= 1
return n + r
#this sorts the array for left-to-right index
def insertionSort(arr, left, right):
for i in range(left + 1, right + 1):
j = i
while j > left and arr[j] < arr[j - 1]:
arr[j], arr[j - 1] = arr[j - 1], arr[j]
j -= 1
# The merge function merges the sorted runs
def merge(arr, l, m, r):
# here, the array is divided into two parts
# left and right array
len1, len2 = m - l + 1, r - m
left, right = [], []
for i in range(0, len1):
left.append(arr[l + i])
for i in range(0, len2):
right.append(arr[m + 1 + i])
i, j, k = 0, 0, l
# merging two arrays into one after comparing
while i < len1 and j < len2:
if left[i] <= right[j]:
arr[k] = left[i]
i += 1
else:
arr[k] = right[j]
j += 1
k += 1
# Copy remaining elements of left, if any
while i < len1:
arr[k] = left[i]
k += 1
i += 1
# Copy remaining element of the right, if any
while j < len2:
arr[k] = right[j]
k += 1
j += 1
# Iterative Timsort function to sort the array
def timSort(arr):
n = len(arr)
minRun = calcMinRun(n)
# Sort individual subarrays of size RUN
for start in range(0, n, minRun):
end = min(start + minRun - 1, n - 1)
insertionSort(arr, start, end)
size = minRun
while size < n:
# Pick the starting point of the left subarray. We
# are going to merge arr[left..left+size-1]
# and arr[left+size, left+2*size-1]
# After every merge, we increase the left by 2*size
for left in range(0, n, 2 * size):
# Find the ending point of the left sub-array
# mid+1 is the starting point of the right subarray
mid = min(n - 1, left + size - 1)
right = min((left + 2 * size - 1), (n - 1))
# Merge sub array arr[left.....mid] &
# arr[mid+1....right]
if mid < right:
merge(arr, left, mid, right)
size = 2 * size
# Driver program to test the above function
if __name__ == "__main__":
array = [18, 2, 1, 12, 5, 3, 8, 11, 7]
print("Given Array is:")
print(array)
# Function Call
timSort(array)
print("After Sorting, the Array is:")
print(array)
Output:
Given Array is
[18, 2, 1, 12, 5, 3, 8, 11, 7]
After Sorting the Array is
[1, 2, 3, 5, 7, 8, 11, 12, 18]
Explanation:
In the code provided, we set the minimum run size to 32. Subsequently, we determine the optimal minimum run size for the specified array through the calcMinRun function. Next, the array is segmented into smaller runs, which are sorted utilizing the Insertion Sort algorithm. Ultimately, these sorted sub-arrays are merged iteratively via the merge function until the whole array is organized.
Complexity Analysis of Tim Sort in Python
In Python, Tim Sort encompasses both Time Complexity and Space Complexity:
Time Complexity
Best-Case Time Complexity:
When the array is either sorted or close to being sorted, this scenario is referred to as the Best-case time complexity, which is represented as O(n).
Average-case time Complexity:
The average-case time complexity is observed when the elements within the array are organized in a random manner. For Tim Sort in Python, the average case time complexity is O(n log n).
Space Complexity: Tim Sort has a Space Complexity of O(n) because it requires additional memory for the merging process.
Conclusion
Tim Sort consists of a combination of two sorting algorithms: Merge Sort and Insertion Sort. It segments the arrays into several sub-arrays known as Runs that are pre-sorted. Subsequently, the elements are gathered and merged into a unified sorted list using an algorithm that traverses the array. We studied the complexity analysis, encompassing both Time Complexity and Space Complexity.
Frequently Asked Questions (FAQs)
1. What is Tim Sort in Python?
In Python, Tim Sort is a combined sorting algorithm that integrates the advantages and effectiveness of both Insertion Sort and Merge Sort.
2. What's the meaning of Tim Sort in Python?
The Tim Sort algorithm utilized in Python is named in honor of Tim Peters, the individual who developed it in 2002 for the CPython framework.
3. What is the time complexity of Tim Sort in Python?
Tim Sort has both Best-case and Average-case Time Complexity. The Best-case time complexity is O(n), while the Average-case time complexity is O(n log n).
4. Is there any Worst-case time complexity in Tim Sort?
We can evaluate the worst-case complexity. The worst-case scenario for TimSort is O(n log n), which is significant, particularly for ensuring performance assurances.
5. What are "runs" in Tim Sort in Python?
In Python, the Tim Sort algorithm segments arrays into several sub-arrays referred to as Runs, which are pre-sorted. This algorithm utilizes these runs for the purpose of merging them together.
6. How does Tim Sort work in Python?
In Python, the Tim Sort algorithm segments arrays into several sub-arrays known as Runs, which are already in a sorted state. The completely sorted array is generated by combining these sub-arrays, similar to the process used in Merge Sort.