# Monday, September 08, 2008

The other day I stumbled on a first draft on a new book on algorithms (Data Structure and Algorithms). After taking a peak at the draft, I found some (hidden) motivation to finally write a decent binary heap for QuickGraph. A heap is a critical data structure for Dijkstra shortest path or Prim's minimum spanning tree algorithm, since it is used to build efficient priority queues.

In this post, we'll start building a binary heap using Test-Driven Development (write the tests first, etc...) and parameterized unit tests.

BinaryHeap?

The heap is a tree where each parent node has a value smaller or equal to the child nodes. The binary heap is a heap implemented through a binary tree, and to make things more interesting (and fast), the tree is usually mapped to an array using indexing magic:

  • parent node index: (index - 1) /2
  • left child node: 2*index + 1
  • right child node: 2*index + 2

The indexing magic is typically the kind of things that introduce bugs.

Let's write that first test

We start by writing a test that simply fills any binary heap with a number of entries. A possible test for this is written as follows:

[PexMethod]
public void Insert<TPriority, TValue>(
    [PexAssumeUnderTest]BinaryHeap<TPriority, TValue> target,
    [PexAssumeNotNull] KeyValuePair<TPriority, TValue>[] kvs)
{
    var count = target.Count;
    foreach (var kv in kvs) {
        target.Add(kv.Key, kv.Value);
        AssertInvariant<TPriority, TValue>(target);
    }
    Assert.IsTrue(count + kvs.Length == target.Count);
}

There are a number of unusual annotations in this test. Let's review all of them:

  • The test is generic: Pex supports generic unit tests. This is really convenient when testing a generic type.
  • PexAssumeUnderTest, PexAssumeNotNull: it basically tells Pex that we don't care about the case where kvs or target is null.
  • We've added a Add(TPriority priority, TValue value) method and Count property to the BinaryHeap

There are also two assertions in the test. Inlined in the loop, we check the invariant of the heap (AssertInvariant). We'll fill up this method as we go. At the end of the test, we check that the Count property.

BinaryHeap version #0

We start with a simple (broken) implementation that does nothing:

public class BinaryHeap<TPriority, TValue> {
    public void Add(TPriority priority, TValue value) { }
    public int Count { returnt 0; }
}

Now that the code compiles, we run Pex which quickly finds that a non-empty array breaks the count assertion.

image

BinaryHeap version #1

We have a failing test so let's fix the code by storing the items in a list:

public class BinaryHeap<TPriority, TValue> {
    List<KeyValuePair<TPriority, TValue>> items = new List<KeyValuePair<TPriority, TValue>>();
    public void Add(TPriority priority, TValue value) {
        this.items.Add(new KeyValuePair<TPriority, TValue>(priority, value));
    }
    public int Count {
        get { return this.items.Count; }
    }

We run Pex again and all the previous failing tests are now passing :)

image

There's a new object creation event that happened (bold events should be looked at). Remember that the test takes a binary heap as argument, well we probably need to tell Pex how to instantiate that class. In fact, this is exactly what happens when I click on the object creation button:

image

Pex tells me that it guessed how to create the binary heap instance and gives me the opportunity to save and edit the factory. The factory looks like this and get included automatically in the test project:

[PexFactoryClass]
public partial class BinaryHeapFactory {
    [PexFactoryMethod(typeof(BinaryHeap<int, int>))]
    public static BinaryHeap<int, int> Create() {
        BinaryHeap<int, int> binaryHeap = new BinaryHeap<int, int>();
        return binaryHeap;
    }
}

All our tests are passing, we can write the next test... but wait!

We have an invariant!

The nice thing about data structures is that they have fairly well defined invariants. These are very useful for testing!

In the case of the heap, we know that the parent node priority should always be less or equal to the priorities of both of his left and right children. Therefore, we can add a method to BinaryHeap that walks the array and checks this property on each node:

[Conditional("DEBUG")]
public void ObjectInvariant() {
    for (int index = 0; index < this.count; ++index) {
        var left = 2 * index + 1;
        Debug.Assert(left >= count || this.Less(index, left));
        var right = 2 * index + 2;
        Debug.Assert(right >= count || this.Less(index, right));
    }
}
private bool Less(int i, int j) { return false; }

Remember that AssertInvariant method, let's call ObjectInvariant in that method and run Pex again.

void AssertInvariant<TPriority,TValue>(BinaryHeap<TPriority,TValue> target) {
    target.ObjectInvariant();
}

Pex immediately finds an issue:

image

This assertion is due to our overly simplified implementation of Less, which always return false.

Fixing tests, and finding new failing tests

We have failing tests so it's time to fix the code again. Let's start fixing on the Less method by using a comparer:

readonly Comparison<TPriority> comparison =
    Comparer<TPriority>.Default.Compare;
private bool Less(int i, int j)
{
    return this.comparison(this.items[i].Key, this.items[j].Key) >= 0;
}

We run Pex and it comes back with the following tests:

image

Two interesting things happened here:

  • the previous failing test (with [0,0], [0,0]) was fixed by fixing Less
  • Pex found a new issue where the input array involves a small key (3584), then a larger key (4098). A correct heap implementation would have kept the smaller key at the first position.

The coolest part is we did not have to write any additional line of code to get to this point: Pex updated the previous tests and generated the new failure for us.

This is a new kind of flow that occurs when using Pex in TDD: a code update has fixed some issues but created new ones. We are still moving towards our goal but we did not have to pay the price of writing a new unit test.

In fact, to fulfill the invariant and make this test pass we will have to write a correct implementation of the Add method.... without writing a single additional line of test code :)

to be continued...