Wednesday, December 29, 2010

C++ Min-Max Heap Implementation

Update: See the GitHub Repo for the latest version of the code!

I wasted a decent amount of time yesterday...

I've recently come across the need for a double ended priority queue (where you can pop both the min and the max efficiently) and I have been fervently searching for the best way to go about implementing one. My initial searches brought me to a paper written in October of 1986, which described a novel data structure called a Min-Max Heap.

Given a set S of values, a min-max heap on S is a binary tree T with the following properties:
  1. T has the heap-shape
  2. T is min-max ordered: values stored at nodes on even (odd) levels are smaller (greater) than or equal to the values stored at their descendants (if any) where the root is at level zero. Thus, the smallest value of S is stored at the root of T, whereas the largest value is stored at one of the root’s children; an example of a min-max heap is shown in Figure 1 (p. 998).

Below is a diagram I created to illustrate a Min-Max Heap. You may notice that the ordering has been reversed, namely the even levels are max levels and the odd levels are min levels.

Figuring that a Min-Max Heap was perfect for my needs, I went ahead and implemented the data structure. However, after I finished, another data structure was brought to my attention: an Interval Heap. I have run across Interval Heaps before in my travels but only now do I recall that they too have the properties I want. I have also found a number of articles telling me that Interval Heaps are the best and Min-Max Heaps are the worst (as far as real time performance goes).

So now I need to create an implementation of an Interval Heap, not to mention I have this grand Min-Max Heap implementation that will surely never find use. I figure I should put it up on the Internet in the hopes that someone will find use for it though...

What exactly does a Min-Max Heap offer a programmer? It allows constant time lookup of its min and max values; logarithmic time insertion; logarithmic time deletion of min and max values; and linear time creation. Really quite a powerful structure. It is known to be about twice as slow as an interval heap in practice though (one article postulated that this was due to the higher likelihood of cache misses).

The reason I needed a structure with these properties was that I wanted to create a Stunted Priority Queue (a term I just now coined, not in any way an accepted term), or a Priority Queue that would allow me to set a maximum size, and it would never exceed that size. If an insertion made the queue exceed that size, the smallest value would be popped off to compensate.

I have created a very solid implementation of a Min-Max Heap, however, I have not included support for linear time creation. The only way to build the heap is to use the push function, which I believe gives you O(n ln(n)) time. If you would like support for that you will have to add it yourself, though if you do, please send me the modified code so I can update this site.

As always, please email me or post here if you find any problems with the code, or if any part of it is confusing so I may fix it.

I'd also like to hear comments on some of my implementation choices. Such as using a zero-indexed heap rather than a one-indexed heap (which would require an extra sizeof(T) memory). Or using template parameters rather than function parameters in my trickleUp() and trickleDown() functions.

Thursday, December 9, 2010

Traditional Inclusion Guards vs. #pragma once

Out with the old, in with the "new"...

My professors seem to like traditional inclusion guards as they never even mention the fact that #pragma once exists. I find this to be odd as #pragma once support has quickly ingrained itself into major compilers, including g++, and non-portability is the only reason I could think of for why one wouldn't use it.

Why is #pragma once so great? Take a gander at this wikipedia article. Also, consider the source tree below.

foo/foo.hpp
#ifndef FOO_HPP #define FOO_HPP double func() { return 1.2; } #endif
bar/foo.hpp
#ifndef FOO_HPP #define FOO_HPP double func2() { return 4.3; } #endif
main.cpp
#include <iostream> #include "foo/foo.hpp" #include "bar/foo.hpp" int main() { std::cout << func1(); std::cout << func2(); // Error: func2 is undefined }

Not all that pleasant right? Probably should have used #pragma once instead...