QuickGraph is now a portable library. Get it while it is hot at http://quickgraph.codeplex.com.

http://rise4fun.com/ is a web front end to a number of tools produced by the RiSE group. It also exposes** ****REST services** that allow you to play with the tools from your favorite environment.

**Rendering DOT graphs to SVG**

DOT is a popular language to describe graphs. It can be rendered into SVG using the MSAGL tool on http://rise4fun.com . To do so, you simply need to do a **POST** query to http://rise4fun.com/services.svc/ask/agl where the dot code is passed in the request body. rise4fun returns SVG markup that can be viewed in browsers that support it.

Wondering what graph SVG look like? Check out http://rise4fun.com/agl/cilreader to see this beautiful graph. Make sure to zoom out as the graph is rather laaaaaarge.

Cheers, Peli

Render beautiful dot graphs to SVG at http://rise4fun.com/agl – yes you can render any graph there. If your graph is big enough, it will trigger the new **edge bundling feature** that looks AMAZING! Try loading this url with modern browser with SVG support…. click, zoom out and wait for the magic.

I just uploaded a refresh of the QuickGraph binaries with Code Contracts references and documentation instrumented with the Contracts.

While on parental leave, I could sneak a couple hours to read a paper on Frontier Search by Korf and al. Frontier search is a kind of best-first search algorithm that reduces the needs for memory. A very interesting and relaxing read… I add an implementation to QuickGraph that follows the idea.

I’ve just made a release of QuickGraph 3.3 on CodePlex. Along with the usual bug fixes, this version brings a set of **new graph data structures based on delegates**. Instead of taking care of the storage, these graphs simply ‘query’ delegates for vertices and out-edges. You can use them to bridge any existing graph data structure in your code with QuickGraph – with almost no performance/memory penalty. (Delegate graphs can also be used to deal with graphs that won’t hold in memory but that’s another story).

Let’s see this with an example. Suppose, we have a simple graph representation using a jagged array, *int[][]* *graph*, where each edge is defined as *(i, graph[i][j])* :

In order to use this graph with QuickGraph, **we ‘wrap’ it up into a delegate graph**. There are a number of extension methods in GraphExtensions that take care of the common cases. Delegate graphs rely on 2 delegates: one that returns the vertices, the other one out-edges:

At this point, we can apply any of the algorithms that QuickGraph provides. For example, computing a topological sort of the vertices:

The new delegate graphs are ready to be downloaded at http://quickgraph.codeplex.com/Release/ProjectReleases.aspx .

Happy programming!

I just released a new version of **QuickGraph for Silverlight** on http://quickgraph.codeplex.com . You can grad it to run your favorite graph algorithm in the silverlight browser!

Ealier this week, we released the Code Contracts library for .NET. Since then, I’ve implemented a lot for contracts for the data structures in QuickGraph . In this post, I’ll talk about my experience with the contracts…

**Walking through some contracts.**

Let’s take a look at the contracts of AddEdge, a method that adds an edge to a graph. Adding an edge to a graph is certainly a fun adventure, when you write contracts for it. Let’s take a look:

public interface IMutableEdgeListGraph<TVertex, TEdge> : ... where TEdge : IEdge<TVertex> { /// <summary> /// Adds the edge to the graph /// </summary> /// <param name="edge"></param> /// <returns>true if the edge was added, otherwise false.</returns> bool AddEdge(TEdge edge);

**Interface contracts**

Since IMutableEdgeListGraph is an interface, we need to store the contracts in separate class. To do so, we ‘bind’ the interface and the contract class to each other using the ContractClassForAttribute/ContractClassAttribute.

[ContractClass(typeof(IMutableEdgeListGraphContract<,>))] public interface IMutableEdgeListGraph<TVertex, TEdge> ...

[ContractClassFor(typeof(IMutableEdgeListGraph<,>))] sealed class IMutableEdgeListGraphContract<TVertex, TEdge> : IMutableEdgeListGraph<TVertex, TEdge> where TEdge : IEdge<TVertex>

Once both types are bound, you can start implementing the contracts of the interface in the contract class. Note that the contract class must use *explicit* interface implementations for all methods.

**Basic null checks**

If you care about null references, the first contract will probably to ensure the edge is not null. To make it quick and painless, make sure you use the ** crn** snippet. Note that since the method body of the contracts does not matter, we simply return the default value.

bool IMutableEdgeListGraph<TVertex, TEdge>.AddEdge(TEdge e) {Contract.Requires(e != null); return default(bool);}

**More pre-conditions**

One of the *implicit* requirement of AddEdge is that both vertices should already belong to the graph. We want to make this *explicit* as a pre-condition as well:

bool IMutableEdgeListGraph<TVertex, TEdge>.AddEdge(TEdge e) { IMutableEdgeListGraph<TVertex, TEdge> ithis = this; Contract.Requires(e != null);Contract.Requires(ithis.ContainsVertex(e.Source)); Contract.Requires(ithis.ContainsVertex(e.Target));

There are two things to notice here: (1) we had to cast the “this” pointer to the interface we are writing contracts for, IMutableEdgeListGraph<,>. Because the methods in a contract class must be explicit interface implementations, we do not have access to the members of this interface from the “this” pointer, (2) ContainsVertex had to be annotated with [Pure], as any method called from a contract must be pure:

[Pure] bool ContainsVertex(TVertex vertex);

**What about post-conditions**

One of my favorite feature of Code Contracts is to be able to state post-conditions. This is done by calling the Contracts.Ensures method. For example, we start by expressing that the edge must belong to the graph when we leave AddEdge:

bool IMutableEdgeListGraph<TVertex, TEdge>.AddEdge(TEdge e) { IMutableEdgeListGraph<TVertex, TEdge> ithis = this;

...

Contract.Ensures(ithis.ContainsEdge(e));

**Result and Old**

Since the method returns a boolean, we should also state something about the result value. To refer to the result, one has to use the Contract.Result<T>() method. In this case, the method returns true if the edge was new, false if it was already in the graph. We can refer to a pre-state, i.e. the value of this.Contains(e) at the beginning of the method***:

Contract.Ensures(Contract.Result<bool>()

== Contract.OldValue(!ithis.ContainsEdge(e)));

Lastly, we can also make sure the edge count has been incremented, if an edge was actually added (as you can see, we could use an implication operator in C#):

Contract.Ensures(ithis.EdgeCount

== Contract.OldValue(ithis.EdgeCount) + (Contract.Result<bool>() ? 1 : 0));

*** The OldState value is evaluated after the preconditions.

**Where to go next?**

In the next post, I’ll show how to leverage these Contracts with Pex…

Following the implementation of the disjointset, another fun algorithm to implement was the offline least common ancestor problem. Tarjan proposed a algorithm based on the disjoint-set and a DFS traversal. Given that I now have all those ingredients in QuickGraph, it was just a matter of hooking together the implementation with the right events. The beef of the implementation looks as follows (and looks quite elegant to me):

var gpair = GraphExtensions.ToAdjacencyGraph(this.pairs); // for fast lookup var disjointSet = new ForestDisjointSet(); var vancestors = new Dictionary (); var dfs = new DepthFirstSearchAlgorithm (this, this.VisitedGraph, new Dictionary GraphColor>(this.VisitedGraph.VertexCount)); dfs.InitializeVertex += (sender, args) => disjointSet.MakeSet(args.Vertex); dfs.DiscoverVertex += (sender, args) => vancestors[args.Vertex] = args.Vertex; dfs.TreeEdge += (sender, args) => { var edge = args.Edge; disjointSet.Union(edge.Source, edge.Target); vancestors[disjointSet.FindSet(edge.Source)] = edge.Source; }; dfs.FinishVertex += (sender, args) => { foreach (var e in gpair.OutEdges(args.Vertex)) if (dfs.VertexColors[e.Target] == GraphColor.Black) this.ancestors[EdgeExtensions.ToVertexPair(e)] = vancestors[disjointSet.FindSet(e.Target)]; }; // go! dfs.Compute(root);

Enjoy!

The **k-shortest paths** computes the *k* first shortest path between two vertices in a directed graph. If you put this in the context of route planning, it gives you *k* alternative routes in case the shortest path is blocked by snow :) While the single source shortest path is well-known and implemented in many languages, there are not many implementations available for this problem, although it has been extensively studied. After looking around, I stumbled on a nice article comparing various approaches. The authors pointed out an algorithm from 1959 from Hoffman and Pavley that solved this problem (there are actually many others). This algorithm looked like a good fit:

- it requires a single call to a single-source shortest path algorithm. Other approaches will require as much as
*kn*calls to a shortest path algorithm. - it sounded simple and did not require new specialized data structures.

**I want to try it!**

The algorithm is available in QuickGraph 3.1.40104.00 and up. You can take a look at it at http://www.codeplex.com/quickgraph/SourceControl/changeset/view/29858#377982. To use it on a *BidirectionalGraph*,

IBidirectionalGraph<TVertex, TEdge> g = …;

foreach(IEnumerable<TEdge> path in g.RankedShortestPathHoffmanPavley(weights, source, goal, 4))

…

**A glimpse at the Hoffmann-Pavley algorithm**

The algorithm works in 2 phases.

On the first phase, we build a minimum ‘successor’ tree towards the goal vertex. This tree can be used to build a shortest path from any (reachable) vertex to the goal vertex. To build this tree, we can simply apply the Dijkstra shortest path algorithm on the reversed graph. This can be done in a couple lines with QuickGraph.

As for many other *k*-shortest path algorithm, phase works by building deviation path and picking the best one. In the case of the Hoffman-Pavley algorithm, it works as follows: pick the latest shortest path, for each vertex of this path, build a deviation path (more later) for each out-edge and add it to a priority queue. Then start again:

var queue = new PriorityQueue<Path>();

queue.Enqueue(shortestpath);

while(queue.Count > 0) {

var path = queue.Dequeue();

foreach(var vertex in path)

foreach(var edge in graph.OutEdges(vertex))

queue.Enqueue(CreateDeviation(path, vertex, edge));

}

A deviation path is composed of three parts:

- the initial part of the ‘seeding’ path, i.e. the edges before the deviation edge,
- the deviation edge
- the remaining shortest path to the goal. When we build the deviation path, we also compute it’s weight. A nice property of the deviation paths is that they can be ‘built’ when needed. This saves a lot of space and computation as most deviation paths will probably not end up in the winning set of paths – instead of storing a full path, we store a path index and an edge index.

That’s it! The details contain more code to deal with self-edges and loops but the main idea is there. This is definitely a very elegant algorithm!

**What about testing!**

Algorithm authors usually illustrate their approach with an example. This is a good way to get started on a small graph example and ensure that the algorithm works as the original author expected. This is the kind of unit test I get started with.

The next step is to apply the algorithm to a large number of graph instances. Unfortunately, I do not have any other k-shortest path algorithm, so the oracle is harder to build here. Nevertheless, the result of the algorithm, i.e. the shortest path collection, has a couple of properties that should always be true:

- the algorithm produces loopless paths,
- path
*k*is lower or equal to path*k*+ 1.

The problem with this test is that it does not guarantee that some shortest path have been missed. At this point, I’m a bit puzzled on how to test that.

When you implement an algorithm that computes an optimal solution (i.e. minimum/maximum with respect to a cost function), how do you test that the solution is actually optimal?

This is the kind of question I face when implementing algorithms in QuickGraph. For example, I was recently looking at the minimum spanning tree of a graph (MST)? While checking that the result tree is a spanning tree is easy, checking that it is *minimum* is not obvious: nobody has written *Assert.IsMinimum* yet :). Here are a couple techniques that I found useful along the way:

**Input-Output table**

The most obvious approach is to pre-compute the result for a number of problems and assert the solution of the algorithm matches. In this MST case, use a small set of graphs for which the MST is well known, and check that the computed MST has the correct weight. This approach will take you only so far and requires a lot of manual work since you need to solve the problem (or find a known solution) for a number of representative cases.

**Solution Perturbation**

If the algorithm computes a solution that is minimal with respect to a cost function, one can try to perturbate the solution to see if there’s a smaller one. If so, it clearly violates the fact that the solution should be minimal, thus you just found a bug. In the case of MST, this would mean randomly picking edges that are not in the minimum spanning tree, swapping them and evaluate if the result tree is smaller.

This is kind of approach is actually used in optimization where the search space might have local minima (see simulated annealing).

**Multiple Implementations**

A nice thing about graph problems is that there are usually many (vastly) different ways to solve them. If multiple implementations are available, we can simply compare the result of each algorithm against each other and make sure that they match each other. Each algorithm might have bugs but it is unlikely that they share common ones. Since we have now a good oracle, we can apply this approach that a large number of inputs to increase our coverage.

In the MST case, two popular algorithms are Prim’s and Kruskal’s. Those are 2 different approaches: Prim is built on top of Dijkstra single source shortest path, while Kruskal is built on top of the disjoint-set data structure. By carefully picking the weights of the edges, we can assert different things:

- if the edge weights are all equals, any spanning tree is minimal. So we can compare the result to a depth-first-search algorithm (which can easily compute a spanning tree).
- if some edge weights are different, there may be many minimum spanning tree. In this case, we can still assert that weight of the tree is minimum.
- if all the edge weights are different, then the MST is unique. This fact can be used to precisely pinpoint the differences in solution during testing.

There is a corner case that needs to be checked: if all algorithm are no-op, i.e. they don’t do anything. There solution will always match!

Happy New Year!

I recently implemented a Disjoint-Set data structure in QuickGraph, which is the main building block of the Kruskal’s minimum spanning tree algorithm. This is a fun data-structure to look at, as it purpose is quite different from the ‘main’ BCL collections. The disjoint-set is useful to partition elements into sets, and defines 2 main operation to that purpose: *Find*, finds the set an element belongs to (can be used to check if 2 elements are in the same set), *Union* merges two sets.

The source of the disjoint-set is in the QuickGraph source, if you are curious about the details.

**Testing the disjoint-set**

There is not much left to TDD when it comes to write data structure. Such data structure are usually described in details in an article, and you ‘just’ have to follow the authors instruction to implement. Nevertheless, unit testing is really critical to ensure that the implementation is correct – but there is a risk of having to re-implement the algorithm to test it.

For the disjoint-set, I used 2 tools developed at RiSE: Code Contracts and… Pex. When implementing data structure from the literature, I usually start ‘dumping’ the code and the contracts. As much as possible, any invariant or property that the author describes should be translated to contracts. It will give more opportunities for Pex to find bugs in my code.

**Contracts First**

For example, the contracts for the **Union** method look as follows:

private bool Union(Element left, Element right) { Contract.Requires(left != null); Contract.Requires(right != null); Contract.Ensures(Contract.Result<bool>() ? Contract.OldValue(this.SetCount) - 1 == this.SetCount : Contract.OldValue(this.SetCount) == this.SetCount ); Contract.Ensures(this.FindNoCompression(left) == this.FindNoCompression(right));

The two first requires clauses do the usual null checks. The first ensures clause checks that if *Union* returns true, a merge has been done and the number of sets has decreased by 1. The last ensures checks that left and right belong to the same set at the end of the union.

Note that so far, I did not have to provide any kind of implementation details. The other methods in the implementation receive the same treatment.

**A Parameterized Unit Test**

I wrote a single parameterized unit test while writing/debugging the implementation of the forest. It could probably have been re-factored into many smaller tests, but for the sake of laziness, I used a single one.

The parameterized unit test implement a common scenario: add elements to the disjoint-set, then apply a bunch of union operations. Along the way, we can rely on test assertion and code contracts to check the correctness of our implementation.

Let’s start with the test method signature:

[PexMethod] public void Unions(int elementCount, [PexAssumeNotNull]KeyValuePair<int,int>[] unions) {

The test takes a number of element to add and a sequence of unions to apply. The input data as is needs to be refined to be useful. In that sense, we add **assumptions**, under the form of calls to *PexAssume*, to tell Pex ‘how it should shape’ the input data. In this case, we want to ensure that *elementCount* is positive and relatively small; and that the values in unions are within the *[0…elementCount)* range.

PexAssume.IsTrue(0 < elementCount); PexSymbolicValue.Minimize(elementCount); PexAssume.TrueForAll( unions, u => 0 <= u.Key && u.Key < elementCount && 0 <= u.Value && u.Value < elementCount );

Now that we have some data, we can start writing the first part of the scenario: filling the disjoint-set. To do so, we simply add the integers from *[0..elementCount)*. Along the way, we check that the *Contains,* *ElementCount*, *SetCount* all behave as expected:

var target = new ForestDisjointSet<int>(); // fill up with 0..elementCount - 1 for (int i = 0; i < elementCount; i++) { target.MakeSet(i); Assert.IsTrue(target.Contains(i)); Assert.AreEqual(i + 1, target.ElementCount); Assert.AreEqual(i + 1, target.SetCount); }

The second part gets more interesting. Each element of the unions array is a ‘union’ action between 2 elements:

// apply Union for pairs unions[i], unions[i+1] for (int i = 0; i < unions.Length; i++) { var left = unions[i].Key; var right= unions[i].Value; var setCount = target.SetCount; bool unioned = target.Union(left, right);

There is 2 things we want to assert here: first, then left and right now belong to the same set. Secondly, that the *SetCount* has been updated correctly: if *unioned* is true, it should have decreased by one.

// should be in the same set now

Assert.IsTrue(target.AreInSameSet(left, right));

// if unioned, the count decreased by 1

PexAssert.ImpliesIsTrue(unioned, () => setCount - 1 == target.SetCount);

}

}

From this parameterized unit test, I could work on the implementation, and refine again and again until it had all passing tests (I did not write **any** other test code)**.**

**Happy Ending**

The resulting test suite generated by Pex is summarized in the screenshot below: the number of elements does not really matter, what is interesting is the sequence of unions performed. This test suite achieves 100% coverage of the methods under test :).

In fact, the *Union* method involved some tricky branches to cover, due to some optimization occurring in the disjoint-set. Pex managed to generate unit tests for each of the branches.

**Thanks to the contracts, the test assertions, and the high code coverage of the generated test suite, I have now a good confidence that my code is properly tested. **The End!

(Next time, we will talk about testing minimum spanning tree implementations…)

I just released a new version of QuickGraph. Among many small improvements here and there, the big news are:

- .net 2.0 support is back! QuickGraph now builds both for .net 2.0 and .net 3.5 (with extension methods).
- The Dijkstra algorithm use a Fibonacci Heap under the hood, which is way more efficient. Courtesy of Roy Williams!
- better serialization support and other bug fixes.

Enjoy, Peli.

I’ve upgraded QuickGraph to C# 3.0, and it’s extension methods and lambdas **(requires .net 3.5)**. The result is fairly nice, specially the extension methods who make algorithms more discoverable.

For example, the following snippet shows how to get the topological sort of tables out of a DataSet (useful when you need to fill/delete tables):

DateSet ds = …; // get some DataSet

// ToGraph() is an extension method on DataSet, which builds

// the Table/Relation graph from the data set schema

var g = ds.ToGraph();

// TopologicalSort() is an extension method on IVertexAndEdgeListGraph<T,E>

// which returns the graph topological sort

var tables = g.TopologicalSort();

foreach(DataTable table in tables)

Console.WriteLine(table.TableName);

More changes were made on the API to take advantage of delegates (i.e. since it’s easier to write them): no more predicate interfaces, dropping dictionaries for delegates in shortest path algorithm etc…