Heap Sort in Python
Heap sort closely resembles selection sort, as it involves identifying the maximum element and placing it at the end of the list. This algorithm is grounded in a comparison sorting method that utilizes the Binary heap data structure. It serves as an excellent illustration of an effective sorting algorithm.
What is Heap Sort?
Heap sort is a widely-used and effective sorting technique. The principle behind heap sort involves "removing" elements from the heap section of the array individually and placing them into the sorted section of the array. Prior to delving deeper into the heap sorting algorithm, it is important to understand the heap data structure.
This algorithm operates in-place, indicating that a constant amount of memory is utilized for the sorted list, and the memory requirement is independent of the initial list's size.
For instance, there is no requirement for an extra memory stack to hold the sorted array, nor is there a need for a recursive call stack. The heapsort algorithm is typically executed using a secondary array to arrange the constant values. This method is efficient, straightforward, intuitive, and easy to implement.
Conversely, heap sort is classified as unstable, indicating that it does not preserve the relative order of elements that have identical values. While it efficiently sorts primitive data types like integers and characters, it encounters difficulties when dealing with complex types and objects.
Let's understand it by the following example -
We possess a custom class named Student that includes attributes for age and name. Within an array, there exist multiple instances of this class, among which is a student named "Thomas" who is aged "20," alongside "Peter," who also has the age of 20, maintaining the same sequence.
When we arrange the array of individuals in order of age, there is no assurance that "Thomas" will precede "Peter" in the resulting sorted array. Although a specific order can be established, certainty is lacking.
Heap Data Structure
The heap data structure is a complete binary tree that satisfies the heap property. It is commonly referred to as the binary heap.
A complete binary tree adheres to these characteristics.
- Each level must be fully populated.
- All nodes are positioned as far to the left as feasible.
The image of the heap displayed above illustrates that it is not sorted. This article will not explore the heap in detail, as our primary aim is to elucidate the Heap sort algorithm rather than the heap itself. In the context of heap sort, the smallest element is consistently positioned as the first element.
A heap tree can be categorized into two varieties: min-heap and max-heap. A min-heap maintains a record of the smallest element, while a max-heap keeps track of the highest element. The heap primarily facilitates the following operations: deleteminimum, getminimum, and add.
The initial element of the heap can be removed after it has been restored. This operation requires O(log N) time, which is quite efficient.
Implementation
Python provides the in-built functions for sorting elements using heap sort. The functions are given below.
- heappush(list, item) - It is used to add the heap element and re-sort it.
- heappop(list) - It is used to remove the element and return the element.
- heapfy - It is used to turn the given list into a heap.
Python Heap Sort Example
Let us consider an example to illustrate heap sort implemented in Python.
from heapq import heappop, heappush
def heapsort(list1):
heap = []
for ele in list1:
heappush(heap, ele)
sort = []
# the elements are lift in the heap
while heap:
sort.append(heappop(heap))
return sort
list1 = [27, 21, 55, 15, 60, 4, 11, 17, 2, 87]
print(heapsort(list1))
Output:
[2, 4, 11, 15, 17, 21, 27, 55, 60, 87]
Explanation
In the code presented above, the heapq module has been imported, which includes the heappop and heappush functions. We defined the Heapsort method, which accepts list1 as a parameter. A for loop was employed to traverse list1 and insert the elements into the initially empty heap. Subsequently, a while loop was utilized to extract sorted elements and append them to the empty sort.
We invoked the Heapsort function and provided it with a list. The function then returned the list in a sorted order.
Sorting Custom Objects
Heap sort is effective for sorting predefined data types; however, it becomes more complex when dealing with user-defined data types, like class objects. In this section, we will focus on sorting these custom objects.
Our implementation relies on the built-in functions available in Python. The following methods are provided:
- heapq.nlargest(n, iterable, *key = None) - This function retrieves a list containing the n largest elements from the given dataset as specified by the iterable.
- heapq.nsmallest(n, iterable, *key = None) - This function returns a list of the n smallest elements from the specified dataset, defined by the iterable.
Python Example for Sorting Custom Objects
Consider an illustration that showcases the method of sorting custom objects in Python.
from heapq import heappop, heappush
class Car:
def __init__(self, model, year):
self.model = model
self.year = year
def __str__(self):
return str.format("Model Name: {}, Year: {}", self.model, self.year)
def __lt__(self, other):
return self.year < other.year
def __gt__(self, other):
return other.__lt__(self)
def __eq__(self, other):
return self.year == other.year
def __ne__(self, other):
return not self.__eq__(other)
def heapsort(list1):
heap = []
for element in list1:
heappush(heap, element)
ordered = []
while heap:
ordered.append(heappop(heap))
return ordered
car1 = Car("Renault", 2001)
car2 = Car("Bentley", 2005)
car3 = Car("Kia", 2014)
car4 = Car("Maruti Suzuki", 1999);
car5 = Car("Nano", 2012)
list1 = [car1, car2, car3, car4, car5]
for c in Heapsort Heapsort (list1):
print(c)
Output:
Model Name: Maruti Suzuki, Year: 1999
Model Name: Renault, Year: 2001
Model Name: Bentley, Year: 2005
Model Name: Nano, Year: 2012
Model Name: Kia, Year: 2014
We have sorted the objects on the year base.
Comparison between Heap sort and Other Algorithm
A widely utilized quick sort algorithm is known for its efficiency; however, heap sort is often favored due to its dependability. The primary advantage of heap sort lies in its time complexity, which has an upper limit of O(nlogn).
The heap sort algorithm has a time complexity of O(nlogn) for both average and worst-case situations, whereas the quick sort performs approximately 20% more efficiently in the average case.
The quick sort algorithm can exhibit reduced performance under certain predictable conditions. There exists a risk of a security vulnerability in quick sort, as the unfavorable O(n2) case can be readily activated.
We will now examine the Merge sort, which has a time complexity comparable to that of the heap sort.
Merge sort exhibits greater stability and is inherently more amenable to parallelization, in contrast to heap sort, which lacks these benefits.
Moreover, in the majority of situations, Merge sort exhibits superior speed compared to Heap Sort, despite both algorithms having identical time complexity.
In comparison, Heapsort can be executed more efficiently in-place than Merge sort can.
Conclusion
Heapsort may not be as widely favored or as swift as other sorting methods, yet it offers greater predictability compared to alternative algorithms. This sorting technique is often chosen when memory usage and security considerations are paramount.
It can be rapidly executed using Python. We need to add the elements into a heap and then extract them.