Heap

# Basic concept

1. Heap is data structure organized by a binary tree in which the value of a parent node is greater than its children (max heap).
2. Heap in Python is implemented with the heapq module including the following functionalities
1. heapq.heappush(heap, item): push an item into heap
2. heapq.heappop(heap): pop an item from heap
3. heapq.pushpop(heap, item): push an item into heap and pop an item from heap
4. heapq.heapify(list): transform a list into heap
5. heapq.heapreplace(heap, tiem): pop an item from heap and push an new item from heap
3. Time complexity of a binary heap
1. Insertion: O(logN)
2. Deletion of min: O(logN)
3. Remove: O(logN)
4. findMin: O(1)
4. Time Complexity compare to other data structures

Structure Insertion Deletion of Min Remove findMin sort
heap O(logN) O(logN) O(logN) O(1) O(NlogN)
orderedarray O(N) O(1) O(1) O(N) O(N)
unordered array O(1) O(N) O(1) O(N) O(NlogN)
BST O(logN) O(logN) O(logN) O(logN) O(N)

# Related coding exercises

## Kth Largest Element in an Array

1. Kth Largest Element in an Array problem is defined as to find the kth largest element in an unsorted array. Note that it is the kth largest element in the sorted order, not the kth distinct element.
2. The problem can be approached with a heap data structure where
1. First to construct a heap array with all numbers in O(nlogn) time
2. Then pop k times, each pop operation uses O(1) time
3. An example Python solution using heap and internal sort is given as the following
class Solution:
# @param {integer[]} nums
# @param {integer} k
# @return {integer}
def findKthLargest(self, nums, k):
'''
'''
#return solution_sort(nums,k)
return solution_heap(nums,k)
import heapq
def solution_heap(nums,k):
h = []
for num in nums:
h.append(-num)
heapq.heapify(h)
print h
res = []
for i in xrange(k):
res = (heapq.heappop(h))
print res
return -res
pass
def solution_sort(nums,k):
'''
use build-in sort
'''
if len(nums) == 0: return None
nums.sort()
return nums[len(nums)-k]
pass


## Sliding Window Maximum

1. Sliding Window Maximum problem is defined as given an array nums, there is a sliding window of size k which is moving from the very left of the array to the very right. You can only see the k numbers in the window. Each time the sliding window moves right by one position.
2. heap approach: add element into a heap one by another, in each step
1. Pop an element from heap until the index is within the k different from the current index
2. The solution roughly takes O(nlogn) time
3. ‘deque’ approach:
1. Maintain a deque structure
2. In each iteration
1. if the deque is empty, add current number into the head of deque
2. pop the element from the tail of the queue when its index is out of range
4. An example Python solution is given as the following
class Solution:
# @param {integer[]} nums
# @param {integer} k
# @return {integer[]}
def maxSlidingWindow(self, nums, k):
return solution(nums,k)
def solution(nums,k):
'''
29
'''
if len(nums) == 0: return nums
h = []
res = []
for i in xrange(k):
heapq.heappush( h, (-nums[i],i) )
(val,i) = heapq.heappop(h)
res.append(-val)
heapq.heappush( h, (-nums[i],i) )
print res
for j in xrange(k,len(nums)):
heapq.heappush( h, (-nums[j],j) )
(val,i) = heapq.heappop(h)
while i<j-k+1:
(val,i) = heapq.heappop(h)
res.append(-val)
heapq.heappush( h,(-nums[i],i) )
return res


## Merge k Sorted Lists

1. Merge k Sorted Lists problem is to merge k sorted linked lists and return it as one sorted list.
2. The problem can be solved by maintaining a heap with $k$ elements, and in each step
1. pop an element from the heap
2. add one element from the corresponding list into heap
3. An example Python solution is given as the following
# Definition for singly-linked list.
# class ListNode:
#     def __init__(self, x):
#         self.val = x
#         self.next = None
class Solution:
# @param {ListNode[]} lists
# @return {ListNode}
def mergeKLists(self, lists):
return solution(lists)
def solution(lists):
if len(lists) == 0: return None
if len(lists) == 1: return lists
res = []
h = []
for i in xrange(len(lists)):
if lists[i] != None:
heapq.heappush(h, (lists[i].val,i) )
lists[i] = lists[i].next
while len(h)>0:
(val,i) = heapq.heappop(h)
res.append(val)
if lists[i] != None:
heapq.heappush(h, (lists[i].val,i))
lists[i] = lists[i].next
if len(res) == 0: return None
print res
for i in xrange(1,len(res)):
tmp = ListNode(res[i])
p.next = tmp
p=p.next


## The Skyline Problem

1. The goal of The Skyline Problem is to find a contour formed by a collection of buildings.
2. The problem can also be solve with a heap structure in O(nlogn) time where n is the number of buildings from the input.
3. In particular, the solution will keep a active building together with a heap of buildings and in each iteration
1. it takes in a new building
2. if the new building is overlap with the active building, the algorithm will either put it into the heap if it is not as high as the active one or use it as a active building and push the previous active building into the heap
3. if the new building is not overlap with the active building, the algorithm will work out the contour of the building in the heap.
4. An example Python solution is given as the following
class Solution:
# @param {integer[][]} buildings
# @return {integer[][]}
def getSkyline(self, buildings):
if len(buildings) == 0: return []
res = []
heap = []
activebuilding = ( buildings,buildings,buildings )
res.append( [activebuilding,activebuilding] )
i=0
while i<len(buildings)-1:
i+=1
currentbuilding = (buildings[i],buildings[i],buildings[i])
if len(buildings) < 10: print i,heap,res,activebuilding,currentbuilding
if currentbuilding > activebuilding:
# current start on the right
# process current heap
if len(heap) == 0:
res.append( [activebuilding,0] )
activebuilding = currentbuilding
res.append( [activebuilding,activebuilding] )
else:
while len(heap) > 0:
(a,b,c) = heapq.heappop(heap)
newactivebuilding = (-a,b,c)
if newactivebuilding > activebuilding:
break
if newactivebuilding == None:
res.append( [activebuilding,0] )
activebuilding = currentbuilding
res.append( [activebuilding,activebuilding] )
else:
res.append( [activebuilding,newactivebuilding] )
activebuilding = newactivebuilding
i-=1
elif currentbuilding < activebuilding:
# current start on the left
if currentbuilding>activebuilding:
heapq.heappush(heap,(-activebuilding,activebuilding,activebuilding))
activebuilding = currentbuilding
print '--', activebuilding
res.append([activebuilding,activebuilding])
else:
heapq.heappush(heap,(-currentbuilding,currentbuilding,currentbuilding))
elif currentbuilding == activebuilding:
# current start on the same position
if currentbuilding >= activebuilding:
activebuilding = currentbuilding
else:
heapq.heappush(heap,(-currentbuilding,currentbuilding,currentbuilding))
(a,b,c) = heapq.heappop(heap)
currentbuilding = (-a,b,c)
res.append([activebuilding,currentbuilding])
activebuilding = currentbuilding
if len(buildings) < 10: print 'heap', heap,res
# no more element
if len(heap) == 0:
res.append( [activebuilding,0] )
while len(heap) > 0:
if len(buildings) < 10:print heap
(a,b,c) = heapq.heappop(heap)
currentbuilding = (-a,b,c)
while len(heap) >0 and currentbuilding <= activebuilding:
(a,b,c) = heapq.heappop(heap)
currentbuilding = (-a,b,c)
if currentbuilding > activebuilding:
res.append( [activebuilding,currentbuilding] )
activebuilding = currentbuilding
res.append([activebuilding,0])
# process res
newres = []
for i in xrange(len(res)-1):
if res[i] != res[i+1]:
newres.append(res[i])
newres.append(res[len(res)-1])
print 'newres',newres
return newres

03 August 2015