Wednesday, December 29, 2010

C++ Min-Max Heap Implementation

Update: See the GitHub Repo for the latest version of the code!

I wasted a decent amount of time yesterday...

I've recently come across the need for a double ended priority queue (where you can pop both the min and the max efficiently) and I have been fervently searching for the best way to go about implementing one. My initial searches brought me to a paper written in October of 1986, which described a novel data structure called a Min-Max Heap.

Given a set S of values, a min-max heap on S is a binary tree T with the following properties:
  1. T has the heap-shape
  2. T is min-max ordered: values stored at nodes on even (odd) levels are smaller (greater) than or equal to the values stored at their descendants (if any) where the root is at level zero. Thus, the smallest value of S is stored at the root of T, whereas the largest value is stored at one of the root’s children; an example of a min-max heap is shown in Figure 1 (p. 998).

Below is a diagram I created to illustrate a Min-Max Heap. You may notice that the ordering has been reversed, namely the even levels are max levels and the odd levels are min levels.

Figuring that a Min-Max Heap was perfect for my needs, I went ahead and implemented the data structure. However, after I finished, another data structure was brought to my attention: an Interval Heap. I have run across Interval Heaps before in my travels but only now do I recall that they too have the properties I want. I have also found a number of articles telling me that Interval Heaps are the best and Min-Max Heaps are the worst (as far as real time performance goes).

So now I need to create an implementation of an Interval Heap, not to mention I have this grand Min-Max Heap implementation that will surely never find use. I figure I should put it up on the Internet in the hopes that someone will find use for it though...

What exactly does a Min-Max Heap offer a programmer? It allows constant time lookup of its min and max values; logarithmic time insertion; logarithmic time deletion of min and max values; and linear time creation. Really quite a powerful structure. It is known to be about twice as slow as an interval heap in practice though (one article postulated that this was due to the higher likelihood of cache misses).

The reason I needed a structure with these properties was that I wanted to create a Stunted Priority Queue (a term I just now coined, not in any way an accepted term), or a Priority Queue that would allow me to set a maximum size, and it would never exceed that size. If an insertion made the queue exceed that size, the smallest value would be popped off to compensate.

I have created a very solid implementation of a Min-Max Heap, however, I have not included support for linear time creation. The only way to build the heap is to use the push function, which I believe gives you O(n ln(n)) time. If you would like support for that you will have to add it yourself, though if you do, please send me the modified code so I can update this site.

As always, please email me or post here if you find any problems with the code, or if any part of it is confusing so I may fix it.

I'd also like to hear comments on some of my implementation choices. Such as using a zero-indexed heap rather than a one-indexed heap (which would require an extra sizeof(T) memory). Or using template parameters rather than function parameters in my trickleUp() and trickleDown() functions.

11 comments:

  1. Very interesting. What do you plan on doing with this double ended priority queue?

    ReplyDelete
  2. I've designed an algorithm that looks a lot like Dijkstra's algorithm (pushes all possible moves onto a priority queue and then always takes the best move). But I'm running the algorithm on a graph that has many millions of nodes, so the queue grows very large. I've found that if I limit the number of moves stored to say, 100, the algorithm will still complete with a near-perfect result.
    That is why I need a Stunted Priority Queue (which was described in the post).

    ReplyDelete
  3. As described in the paper you refer to,
    in line 223 you should swap the node at the index smallest node with its parent if the parent node is bigger before doing the TrickleDown_ step.
    Otherwise, you may violate the heap condition.
    By the way, thanks for sharing :)

    ReplyDelete
  4. Thank you very much! I created a test harness for it and it started sputtering once the inputs sizes got a little large. I have fixed it now however and put the code into a GitHub repo. See https://github.com/brownhead/minmaxheap-cpp

    ReplyDelete
  5. Did you ever create or find an implementation of the "stunted priority queue"?

    ReplyDelete
  6. No, sorry :(. It should not be hard to create one based off this implementation, but you should probably base it off of a Interval Heap implementation instead.

    ReplyDelete
  7. How can fix code which has root to be min level?

    ReplyDelete
  8. Did you ever find an implementation of the interval heap in c++? Would love to see it, but this is also great, thanks for posting it.

    ReplyDelete
  9. John, your creation is not a Min-max heap. Try a basic example like:
    https://upload.wikimedia.org/wikipedia/commons/5/50/Min-max_heap.jpg

    ReplyDelete
    Replies
    1. if its not a min-max heap what is it? I have gone through his code and it looks like min-max heap to me.

      Delete
  10. Please implement this code without using these shitty libraries, cuz that was my Assignment...

    ReplyDelete