# Friday, May 28, 2004

I just came back from Microsoft in Redmond, where I attended interviews for an SDE/T position in the CLR team. (Interviewing at Microsoft is a unique experience in its own right.) It looks like it worked, because I'm moving to Redmond in October. :) I would like to thank Michael Corning, Harry Robinson and Holly Barbacovi for their support on this adventure.

Me and Michael Corning at Building 44 in Redmond.

Don't be fooled by the bad quality of the photo: it was pouring rain (the famous Redmond weather).

 

posted on Friday, May 28, 2004 1:25:00 AM (Pacific Daylight Time, UTC-07:00)  #    Comments [9]
# Saturday, May 22, 2004

I've recently become interested in Mutation Testing, a fun way of measuring the quality of tests. This post presents the first snapshot of a toy application that mutates any .NET program.

Mutation testing consists of inserting "artificial" faults into the Implementation Under Test and checking whether the tests catch them. The idea is that the tests are adequate if they detect all the faults. For example, a typical mutation is to negate the condition expression in an if statement:

//original
if (condition)
   DoSomething();
// mutated
if (!condition)
   DoSomething();
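To make the idea concrete, here is a minimal, self-contained sketch of a mutant that one test misses and another one catches. It is not part of AssemblyScrambler: the names MutationDemo, Max, MaxMutant, WeakTest and StrongTest are made up for this illustration, and the mutation is written by hand instead of being applied to the IL.

```csharp
using System;

public static class MutationDemo
{
    // Original implementation under test.
    public static int Max(int a, int b)
    {
        if (a > b)
            return a;
        return b;
    }

    // Hand-written mutant: the condition of the if statement is negated.
    public static int MaxMutant(int a, int b)
    {
        if (!(a > b))
            return a;
        return b;
    }

    // Weak test: both arguments are equal, so either branch returns 3.
    public static bool WeakTest(Func<int, int, int> max)
    {
        return max(3, 3) == 3;
    }

    // Stronger test: the arguments differ, so the negated branch is exposed.
    public static bool StrongTest(Func<int, int, int> max)
    {
        return max(5, 2) == 5;
    }

    public static void Main()
    {
        // A mutant is "killed" when a test that passes on the original fails on it.
        Console.WriteLine("weak test kills mutant:   {0}", !WeakTest(MaxMutant));
        Console.WriteLine("strong test kills mutant: {0}", !StrongTest(MaxMutant));
    }
}
```

Both tests pass on Max, but only StrongTest fails on MaxMutant. A suite containing only WeakTest would let the mutant survive, which is exactly the kind of weakness mutation testing is meant to reveal.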

Jester implements mutation testing for JUnit (there is a nice article here about Jester). In his thesis (which Lutz Roeder kindly pointed out to me), A Mutation Testing Tool for Java Programs, Matthias Bybro defines an entire framework for generating and executing mutants. In this post, I will not focus on the theory of mutation testing; instead, I'll show how it can be implemented in .NET.

The tools we need

As usual, before attacking the problem we can review what functionality we need and what we have in our tool set. In this case, we need the ability to load an assembly, explore and alter the IL, and execute or write the mutated assembly. Got any ideas?

RAIL! The Runtime Assembly Instrumentation Library is exactly what we need. With RAIL, you can load an assembly, explore and alter the IL, and execute or write the mutated assembly. You can even substitute types or entire functions. In fact, in the PowerPoint presentation of RAIL, the authors show how to play with IL.

Let's code

The AssemblyScrambler application is designed as follows: a ScramblerEngine instance contains a collection of IScrambler instances. An IScrambler instance contains a method to scramble IL code (I started this application before knowing about mutation testing, so the scramblers should really be named mutators, etc.):

public interface IScrambler
{
    void Scramble(ScrambleTrace trace,RMethodDef method);
}

where trace is used to log mutations, and method is an instance of Rail.Reflect.RMethodDef which represents a method. The scramblers are used as follows in ScramblerEngine:

public void Scramble(string fileName)
{
    this.assembly = RAssemblyDef.LoadAssembly(fileName);
    foreach(RTypeDef t in this.assembly.RModuleDef.GetTypes())
    {
        foreach(RMethodDef method in t.GetMethods())
        {
            foreach(IScrambler scrambler in this.Scramblers)
            {
                scrambler.Scramble(trace,method);
            }
        }
    }
}

We are now ready to start implementing scramblers. There is currently only one, which switches brtrue -> brfalse and brfalse -> brtrue. RMethodDef contains a MethodBody that contains a Code instance. Code is a mutable collection of instructions:

for(int i = 0;i<method.MethodBody.Code.InstructionCount;++i)
{
    Instruction il = method.MethodBody.Code[i];
    // selecting instruction
    if (il.OpCode.OperandType != OperandType.InlineBrTarget 
    && il.OpCode.OperandType != OperandType.ShortInlineBrTarget)
        continue;
    // il is ILBranch
    ILBranch branch = (ILBranch)il;
    // check if is brfalse
    if (il.OpCode.Name == OpCodes.Brfalse.Name)
    {
        // substitute with brtrue
        method.MethodBody.Code[i]=new ILBranch(OpCodes.Brtrue,branch.Target);
    }
    else ...

Switching brtrue and brfalse is as simple as that. Note that 99% of the work here is done by the excellent RAIL library.
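The opcode swap itself can be sketched without RAIL, using only the OpCodes table from System.Reflection.Emit. This is an illustration under assumptions, not the scrambler's actual code: BranchNegator and Negate are hypothetical names, and covering the short forms (brtrue.s / brfalse.s) is my own addition.

```csharp
using System;
using System.Reflection.Emit;

public static class BranchNegator
{
    // Returns the opposite conditional branch, or the opcode unchanged
    // when it is not one of the brtrue/brfalse variants.
    public static OpCode Negate(OpCode op)
    {
        if (op == OpCodes.Brtrue) return OpCodes.Brfalse;
        if (op == OpCodes.Brfalse) return OpCodes.Brtrue;
        if (op == OpCodes.Brtrue_S) return OpCodes.Brfalse_S;
        if (op == OpCodes.Brfalse_S) return OpCodes.Brtrue_S;
        return op;
    }

    public static void Main()
    {
        // Print the mnemonic of each negated opcode.
        Console.WriteLine(Negate(OpCodes.Brfalse_S).Name);
        Console.WriteLine(Negate(OpCodes.Br).Name);
    }
}
```

Comparing OpCode values directly avoids comparing mnemonic strings, and unconditional branches such as br pass through untouched.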

Small example

Let's apply the scrambler to a small method:

public void IsTrue(bool isTrue)
{
    Console.Write("Expected: {0}, ",isTrue);
    if (isTrue)
        Console.WriteLine("Actual: true");
    else
        Console.WriteLine("Actual: false");
}

The IL code for this method is the following (using Reflector):

.method public hidebysig instance void IsTrue(bool isTrue) cil managed
{
// Code Size: 42 byte(s)
.maxstack 2
L_0000: ldstr "Expected: {0}, "
L_0005: ldarg.1 
L_0006: box bool
L_000b: call void [mscorlib]System.Console::Write(string, object)
L_0010: ldarg.1 
L_0011: brfalse.s L_001f
L_0013: ldstr "Actual: true"
L_0018: call void [mscorlib]System.Console::WriteLine(string)
L_001d: br.s L_0029
L_001f: ldstr "Actual: false"
L_0024: call void [mscorlib]System.Console::WriteLine(string)
L_0029: ret 
}

You can see that the instruction at offset L_0011 (brfalse.s) is the one we target. We have a small console application that calls this method. The code and results are:

Sandbox sandbox = new Sandbox();
sandbox.IsTrue(true);
sandbox.IsTrue(false);
-- output
Expected: True, Actual: true
Expected: False, Actual: false

After mutation

The above method is passed through the AssemblyScrambler machinery. The IL code of the mutated application now looks like this:

.method public hidebysig instance void IsTrue(bool isTrue) cil managed
{
// Code Size: 42 byte(s)
.maxstack 3
L_0000: ldstr "Expected: {0}, "
L_0005: ldarg.1 
L_0006: box bool
L_000b: call void [mscorlib]System.Console::Write(string, object)
L_0010: ldarg.1 
L_0011: brtrue.s L_001f
L_0013: ldstr "Actual: true"
L_0018: call void [mscorlib]System.Console::WriteLine(string)
L_001d: br.s L_0029
L_001f: ldstr "Actual: false"
L_0024: call void [mscorlib]System.Console::WriteLine(string)
L_0029: ret 
}

Take a look at L_0011: it is now brtrue.s, so the method has been mutated. In fact, the output of the snippet becomes:

Expected: True, Actual: false
Expected: False, Actual: true

You can download the source at http://www.dotnetwiki.org/DesktopDefault.aspx?tabid=121. Don't forget that you need the RAIL assemblies.
posted on Saturday, May 22, 2004 12:25:00 PM (Pacific Daylight Time, UTC-07:00)  #    Comments [5]
# Friday, May 21, 2004

In the post Fun with Graphs (3): Creating the graph of a database structure, I presented a small application that creates the graph of a database. We are now going to improve the output by adding the different fields, primary keys, etc. to the graph.

GraphvizRecordCell

Graphviz supports a type of vertex shape that is drawn as nested tables. This shape is called Record. NGraphviz comes with a wrapper class (GraphvizRecordCell) that lets you easily create such records. Some remarks on cells:

  • A GraphvizRecordCell can also contain other nested cells,
  • By default, Graphviz starts by arranging the cells horizontally and switches direction (vertical/horizontal) at each nesting level.

Let's take the formatVertex event handler and adapt it to create records:

private void formatVertex(Object sender, FormatVertexEventArgs e)
{
    TableSchemaVertex v = (TableSchemaVertex)e.Vertex;
    GraphvizRecord record = new GraphvizRecord();
    e.VertexFormatter.Shape = GraphvizVertexShape.Record;
    e.VertexFormatter.Record = record;
    GraphvizRecordCell table = new GraphvizRecordCell();
    record.Cells.Add(table);

    GraphvizRecordCell name = new GraphvizRecordCell();
    name.Text = v.Table.Name;
    table.Cells.Add(name);
    ...

Here's a sample result on the MbUnit database:

posted on Friday, May 21, 2004 12:56:00 PM (Pacific Daylight Time, UTC-07:00)  #    Comments [1]

Over the last few days, I have started to prepare MbUnit to support loading test assemblies into a separate AppDomain. This feature is very important for a number of reasons:

  • test assemblies are shadow copied,
  • test assemblies can be unloaded. This means that MbUnit can detect when you have recompiled the test assembly and reload it. Assembly unloading is very important if you plan to do Test Driven Development (test, code, test, code...),
  • it is easier to control the AssemblyResolve event.

Of course, executing the tests in a separate AppDomain has a big drawback: test results and notifications are transmitted through Remoting, and this costs CPU cycles. Currently there is a big performance hit (tests run about twice as slow) when using a separate AppDomain. A possible explanation is that too many event notifications need to cross the Remoting channel.

To be continued...

posted on Friday, May 21, 2004 7:02:00 AM (Pacific Daylight Time, UTC-07:00)  #    Comments [4]
# Thursday, May 20, 2004

Sorting the fixtures by namespace/type is nice... but as always, there are situations where you would like to sort fixtures using other criteria. For example, you might want to sort the tests by author, category, importance, etc.

While preparing for AppDomain remoting, I have totally refactored the way MbUnit populates the tree to make it fully extensible: now you can populate the tree any way you like!

FixtureCategoryAttribute

This is a new attribute that tags fixtures so that they can be sorted by category. You can describe a nested category by separating the names with dots (like a namespace), and you can tag a fixture with multiple categories (a single fixture can be part of multiple categories). For example:

[CompositeFixture(typeof(EnumerableTest))]
[ProviderFactory(typeof(ArrayListFactory),typeof(IEnumerable))]
[ProviderFactory(typeof(HashtableFactory),typeof(IEnumerable))]
[Pelikhan] // author
[FixtureCategory("Important.Tests.Should.Be.Here")] // categories
[FixtureCategory("A.Test.Can.Be.In.Multiple.Categories")]
[FixtureCategory("A.Test.Can.Be.In.Multiple.Categories2")]
public class CompositeTest
{
}
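How dotted names turn into nested tree nodes can be sketched in a few lines. This is only an illustration under assumptions: CategoryNode and its members are hypothetical names, not MbUnit types, and the real tree population code may work differently.

```csharp
using System;
using System.Collections.Generic;

public class CategoryNode
{
    public readonly string Name;
    public readonly SortedDictionary<string, CategoryNode> Children =
        new SortedDictionary<string, CategoryNode>();
    public readonly List<string> Fixtures = new List<string>();

    public CategoryNode(string name) { Name = name; }

    // Walks (and creates on demand) one child per dot-separated segment,
    // then records the fixture name at the deepest node.
    public void Add(string category, string fixture)
    {
        CategoryNode node = this;
        foreach (string part in category.Split('.'))
        {
            CategoryNode child;
            if (!node.Children.TryGetValue(part, out child))
            {
                child = new CategoryNode(part);
                node.Children.Add(part, child);
            }
            node = child;
        }
        node.Fixtures.Add(fixture);
    }
}

public static class CategoryDemo
{
    public static void Main()
    {
        CategoryNode root = new CategoryNode("Categories");
        // The same fixture is filed under two categories sharing a prefix.
        root.Add("Important.Tests", "CompositeTest");
        root.Add("Important.Slow", "CompositeTest");
        foreach (string name in root.Children["Important"].Children.Keys)
            Console.WriteLine("Important." + name);
    }
}
```

Because a fixture is simply appended to the node of each category it declares, nothing prevents it from appearing in several branches of the tree.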

Screenshot

Here's a screenshot of the latest MbUnit snapshot: as you can see, the tests are sorted by namespace, author and category.

posted on Thursday, May 20, 2004 10:34:00 AM (Pacific Daylight Time, UTC-07:00)  #    Comments [3]
# Wednesday, May 19, 2004

This article tries to give a rough overview of the MbUnit vision of tests, and consequently of its architecture. It contains some material from a previous CodeProject article.

Why MbUnit ?

Unit testing is a great tool for ensuring an application's quality, and frameworks like NUnit or csUnit have made it very simple to implement. However, as the number of tests grows, the need for more functionality begins to show. The above frameworks are based on the Simple Test Pattern, which is basically the sequence of SetUp, Test, TearDown actions. Although highly generic, this solution leaves a lot of work to the test writer. Sadly, there is no easy way to derive and include a new "fixture" type in those frameworks.

MbUnit was simply born from the fact that I wanted a new fixture and integrating it into existing frameworks was nearly impossible (I was also recovering from knee surgery in hospital with nothing else to do but code).

Illustrating example

In order to make things clear, I will refer to an example while explaining how MbUnit works. Consider the Simple Test Pattern, which is implemented by most unit test frameworks available. This is the classic way of writing unit tests, as described in the figure below. A TestFixture attribute tags the test class, set up is done in the SetUp tagged method, tests are done in the Test tagged methods and clean up is performed in the TearDown tagged method. This is illustrated on the left of the figure.

Attribute -> Run -> Invoker

The kernel of MbUnit is composed of different components that work in sequence. The first component is the fixture attribute.

The fixture attribute is used to tag the classes that contain unit tests (TestFixtureAttribute is a fixture attribute). The new thing in MbUnit is that each fixture attribute contains the execution logic of the fixture, which is returned at run-time in the form of a Run (IRun interface). In the case of the example, the TestFixtureAttribute is defined as a sequence of SetUp, Test and TearDown:

public class TestFixtureAttribute : TestFixturePatternAttribute 
{
     public override IRun GetRun()
     {
          SequenceRun runs = new SequenceRun();
            
          // setup
          OptionalMethodRun setup = new
                              OptionalMethodRun(typeof(SetUpAttribute),false);
          runs.Runs.Add( setup );
            
          //tests
          MethodRun test =new MethodRun(typeof(TestPatternAttribute),true,true);
          runs.Runs.Add(test);
            
          // tear down
          OptionalMethodRun tearDown = new
                           OptionalMethodRun(typeof(TearDownAttribute),false);
          runs.Runs.Add(tearDown);
            
          return runs;                        
     }
}

where

  • TestFixturePatternAttribute is the abstract base class for all new fixture attributes in MbUnit,
  • the GetRun method is called by the MbUnit core to find out the execution path of the fixture. The fixture can use built-in basic attributes to build its execution path,
  • an IRun instance can represent the call to a method, or to a sequence of methods, etc.,
  • SequenceRun is a sequence of IRun instances,
  • MethodRun is an IRun instance that wraps a call to a method tagged by a predefined attribute,
  • OptionalMethodRun inherits from MethodRun and describes optional methods.

The IRun object creates an execution tree by exploring the tagged type. Each node of the tree contains a RunInvoker (IRunInvoker interface). The RunInvoker is in charge of calling the method, guarding against exceptions, loading data, etc. On our sample fixture, there are two tests that the Run will extract:

Once the tree is built, we simply enumerate all the possible paths from the root node to the leaves to obtain the different possible tests. Each of these paths is called a Pipe (RunPipe class).
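The root-to-leaf enumeration can be sketched as a depth-first walk. The types below (RunNode, PipeExtractor) are hypothetical stand-ins written for this illustration; they are not the MbUnit classes, and a pipe is reduced to a string of method names.

```csharp
using System;
using System.Collections.Generic;

public class RunNode
{
    public readonly string Name;
    public readonly List<RunNode> Children = new List<RunNode>();

    public RunNode(string name, params RunNode[] children)
    {
        Name = name;
        Children.AddRange(children);
    }
}

public static class PipeExtractor
{
    // Depth-first walk that emits one path per leaf of the tree.
    public static List<string> GetPipes(RunNode root)
    {
        List<string> pipes = new List<string>();
        Walk(root, new List<string>(), pipes);
        return pipes;
    }

    private static void Walk(RunNode node, List<string> path, List<string> pipes)
    {
        path.Add(node.Name);
        if (node.Children.Count == 0)
            pipes.Add(string.Join(" -> ", path.ToArray()));
        foreach (RunNode child in node.Children)
            Walk(child, path, pipes);
        path.RemoveAt(path.Count - 1); // backtrack before visiting siblings
    }

    public static void Main()
    {
        // SetUp, then one of two tests, then TearDown.
        RunNode tree = new RunNode("SetUp",
            new RunNode("Test1", new RunNode("TearDown")),
            new RunNode("Test2", new RunNode("TearDown")));
        foreach (string pipe in GetPipes(tree))
            Console.WriteLine(pipe);
    }
}
```

With two test methods under a shared SetUp, the walk yields two pipes, each containing its own SetUp and TearDown steps, which is why each test can be run on its own.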

In the GUI, the RunPipe instances are attached to the TreeNode nodes so you can easily select and execute the tests separately. This ensures that test executions are isolated.

This architecture brings a lot of flexibility (and complexity) to the kind of fixtures that can be defined. Any user can define their own fixture and use MbUnit to execute it.

posted on Wednesday, May 19, 2004 11:22:00 AM (Pacific Daylight Time, UTC-07:00)  #    Comments [4]

As most of you should know by now, PageRank is the ranking system used by Google to estimate the importance of a page (you can see it in the Google toolbar). Of course, since the basic idea of the algorithm was published, there may have been some significant modifications. In this post, I'll show how we can use QuickGraph to compute the PageRank of a graph...

PageRank

The idea behind PageRank is simple and intuitive: pages that are important are referenced by other important pages, and the importance of a page is distributed over its out-edges. There is extensive literature on the web that explains PageRank:

The PageRank is computed by using the following iterative formula:

PR(A) = (1-d) + d (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn)) 

where PR is the PageRank, T1, ..., Tn are the pages that link to A, d is a damping factor usually set to 0.85, and C(v) is the number of out-edges of v.

PageRank can also be expressed in terms of matrix algebra, where it can be shown to be equivalent to finding the principal eigenvector of a sparse matrix (see http://citeseer.ist.psu.edu/kamvar03extrapolation.html). In fact, the above formula is equivalent to the Power Method (a slow method for finding the dominant eigenvalue and its eigenvector).

Where's my LAPACK ?

Until C# (and .NET) has a real and free wrapper around LAPACK, we cannot attack PageRank using matrix algebra. As mentioned above, the formula is equivalent to the Power Method, which is a slow (potentially very slow) method for computing the dominant eigenvalue: the error shrinks by a factor of |la_2 / la_1| per iteration, where la_1 is the largest eigenvalue and la_2 is the second largest, so convergence is slow when the two are close. There are other, faster methods (like model reduction) that we could use to speed things up if we had LAPACK (I'm throwing a bottle into the .NET sea here).

Implementation in QuickGraph

Since we have no matrix algebra, the implementation is very basic and inefficient (I'm almost ashamed). This is almost a disclaimer: do not use this algorithm for big graphs, it is potentially slow. The main loop looks like this:

// temporary rank dictionary
VertexDoubleDictionary tempRanks = new VertexDoubleDictionary();
// create filtered graph that removes dangling links
FilteredBidirectionalGraph fg = new FilteredBidirectionalGraph(
    this.VisitedGraph,
    Preds.KeepAllEdges(),
    new InDictionaryVertexPredicate(this.ranks)
    ); 

int iter = 0;
double error = 0;
do
{
    // compute page ranks
    error = 0;
    foreach(DictionaryEntry de in this.Ranks) 
    {
        IVertex v = (IVertex)de.Key;
        double rank = (double)de.Value;

        double r = 0;
        foreach(IEdge e in fg.InEdges(v))
        {
            r += this.ranks[e.Source] / fg.OutDegree(e.Source);
        }
        // add sourceRank and store
        double newRank = (1-this.damping) + this.damping * r;
        tempRanks[v] = newRank;
        // compute deviation
        error += Math.Abs(rank - newRank);
    } 
    // swap ranks
    VertexDoubleDictionary temp = ranks;
    ranks = tempRanks;
    tempRanks = temp; 
    iter++;
// iterate until convergence, or max iteration reached
}while( error > this.tolerance && iter < this.maxIterations);

where ranks holds the current PR values and damping is the d factor. Note that because we use enumerators, we cannot modify the ranks while iterating over the dictionary, otherwise the enumerator would be invalidated. Therefore we use two dictionaries and swap them after each iteration.
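For readers who want to try the iteration without QuickGraph, here is a self-contained sketch of the same loop on plain arrays. It rests on several assumptions of mine: vertices are integers 0..n-1, edges are (source, target) pairs, every rank starts at 1.0, and no dangling-link filtering is performed. SimplePageRank and Compute are hypothetical names.

```csharp
using System;

public static class SimplePageRank
{
    // edges[k] = { source, target }; vertices are numbered 0..n-1.
    public static double[] Compute(int n, int[][] edges,
        double damping, double tolerance, int maxIterations)
    {
        int[] outDegree = new int[n];
        foreach (int[] e in edges)
            outDegree[e[0]]++;

        double[] ranks = new double[n];
        double[] temp = new double[n];
        for (int i = 0; i < n; i++)
            ranks[i] = 1.0; // assumed initial value

        int iter = 0;
        double error;
        do
        {
            // each source gives rank/outDegree to every page it links to
            for (int i = 0; i < n; i++)
                temp[i] = 0;
            foreach (int[] e in edges)
                temp[e[1]] += ranks[e[0]] / outDegree[e[0]];

            // apply the damping factor and measure the total change
            error = 0;
            for (int i = 0; i < n; i++)
            {
                temp[i] = (1 - damping) + damping * temp[i];
                error += Math.Abs(ranks[i] - temp[i]);
            }

            // swap the two buffers, as in the dictionary version
            double[] swap = ranks;
            ranks = temp;
            temp = swap;
            iter++;
        } while (error > tolerance && iter < maxIterations);
        return ranks;
    }

    public static void Main()
    {
        // pages 0 and 1 both link to page 2, which links back to page 0
        int[][] edges = {
            new int[] { 0, 2 }, new int[] { 1, 2 }, new int[] { 2, 0 }
        };
        double[] ranks = Compute(3, edges, 0.85, 1e-9, 1000);
        for (int i = 0; i < ranks.Length; i++)
            Console.WriteLine("PR({0}) = {1:F4}", i, ranks[i]);
    }
}
```

In this small graph, page 1 has no incoming links, so its rank settles at 1-d, while page 2, which collects links from both other pages, ends up with the highest rank.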

The results

As usual, we use GraphvizAlgorithm to output the results. Here are some sample graphs with their corresponding page ranks:

posted on Wednesday, May 19, 2004 9:22:00 AM (Pacific Daylight Time, UTC-07:00)  #    Comments [5]
# Tuesday, May 18, 2004

This is the first episode of a series of articles on graphs and databases. On today's menu:

  1. how to extract the structure of a database,
  2. create a graph representation using QuickGraph
  3. draw it using NGraphviz

To make things user friendly, I will use the PropertyGrid to set up things.

Extracting the database schema

At first sight this seemed to be a tedious and boring task, but fortunately a light popped up in the back of my head saying "you have already seen that in CodeSmith". In fact, CodeSmith comes with an assembly, SchemaExplorer, whose purpose is to extract database schemas. Even better, the main class, DatabaseSchema, comes with a custom type editor (DatabaseSchemaTypeEditor) so that integration in the PropertyGrid is straightforward.

public class DataGraphProperties
{
    private DatabaseSchema schema = null;
    [Category("Data")]
    [TypeConverter(typeof(DatabaseSchemaTypeConverter))]
    public DatabaseSchema Schema
    {
        get
        {
            return this.schema;
        }
        set
        {
            this.schema = value;
        }
    }
}

In the PropertyGrid, the Schema property lets the user select a data source.

Database and graphs

It is easy to see that a database is a graph where the tables are the vertices and the foreign keys are the edges. The DatabaseSchema class contains the collection of tables (TableSchema instances), and each table contains a collection of foreign keys (TableKeySchema instances). So we have all we need to populate the graph.

Custom Vertex and Edges

The first step in creating a representation of the database as a QuickGraph graph is to create the custom vertex class (which implements IVertex) and the custom edge class (which implements IEdge). This task is straightforward using the two default classes, Vertex and Edge, available in the QuickGraph assembly. This is illustrated for TableSchemaVertex:

public class TableSchemaVertex : Vertex
{
    private TableSchema table = null;
    public TableSchemaVertex(int id)
    :base(id)
    {}

    public TableSchema Table
    {
        get
        {
            if (this.table==null)
                throw new InvalidOperationException("table not initialized");
            return this.table;
        }
        set
        {
            this.table = value;
        }
    }
}

The TableSchemaVertex instances are created by a vertex provider:

public class TableSchemaVertexProvider : TypedVertexProvider
{
    public TableSchemaVertexProvider()
    :base(typeof(TableSchemaVertex))
    {}
}

The same thing is done for the edges; the edge class is called TableKeySchemaEdge.

Custom Graph

The custom graph is generated using the CodeSmith template AdjacencyGraph.cst. The class is called DatabaseSchemaGraph.

Populating the graph

Once the data structure is ready, populating the graph with the tables and the keys is straightforward:

DatabaseSchema schema = ...;
DatabaseSchemaGraph graph = ...;
// add tables;
foreach(TableSchema table in schema.Tables)
{
    graph.AddVertex(table);
}
// foreach table, add all relations (out-edges)
foreach(TableSchema table in schema.Tables)
{
    foreach(TableKeySchema key in table.ForeignKeys)
    {
        graph.AddEdge(key);
    }
}

That's it :)
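The tables-as-vertices, keys-as-edges mapping can also be tried without CodeSmith or QuickGraph. The sketch below is an illustration under assumptions: tables are plain strings, a foreign key is a (referencing table, referenced table) pair, and the table names are invented. SchemaGraphDemo, Build and ToDot are hypothetical names; the output is plain Graphviz DOT text.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class SchemaGraphDemo
{
    // One vertex per table, one out-edge per foreign key.
    public static Dictionary<string, List<string>> Build(
        string[] tables, string[][] foreignKeys)
    {
        Dictionary<string, List<string>> graph =
            new Dictionary<string, List<string>>();
        foreach (string table in tables)
            graph[table] = new List<string>();
        foreach (string[] key in foreignKeys)
            graph[key[0]].Add(key[1]); // key[0] references key[1]
        return graph;
    }

    // Writes the adjacency lists as a Graphviz digraph.
    public static string ToDot(Dictionary<string, List<string>> graph)
    {
        StringBuilder sb = new StringBuilder();
        sb.AppendLine("digraph schema {");
        foreach (KeyValuePair<string, List<string>> entry in graph)
        {
            sb.AppendLine("  \"" + entry.Key + "\";");
            foreach (string target in entry.Value)
                sb.AppendLine("  \"" + entry.Key + "\" -> \"" + target + "\";");
        }
        sb.AppendLine("}");
        return sb.ToString();
    }

    public static void Main()
    {
        string[] tables = { "Orders", "Customers", "Products" };
        string[][] keys = {
            new string[] { "Orders", "Customers" },
            new string[] { "Orders", "Products" }
        };
        Console.Write(ToDot(Build(tables, keys)));
    }
}
```

The two-pass structure mirrors the loops above: all vertices are added first so that every edge finds both of its end points.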

Let's do some drawing

Now that we have a graph of the database, the Graphviz "machinery" can be used to output a number of different drawings (refer to this post for a detailed tutorial on using Graphviz). Bundle that with the PropertyGrid and we get a nice and simple database grapher. I have applied DbGrapher to the database that MbUnit uses to store test results:

Next episode

In the next episode, we will see how to improve the (poor) quality of the drawing and how to detect cascade cycles (on delete cycles etc...).

posted on Tuesday, May 18, 2004 9:37:00 PM (Pacific Daylight Time, UTC-07:00)  #    Comments [2]
# Monday, May 17, 2004

The Abstract Test Pattern (ATP)

I have received a few comments on my blog entry on Composite Unit Testing (CUT) arguing that this was the Abstract Test Pattern. Here's the definition from http://c2.com/cgi/:

A Testing Pattern describing a way to reuse test cases for multiple implementations of an Interface.
Problem
How to write a Test Suite against an Interface (or Abstract Class) that can be used to test all implementations of the interface.

Solution

  • Write an AbstractTest for every Interface (and Abstract Class). The AbstractTest should have an abstract FactoryMethod that creates an object with the type of the Interface.
  • Write a ConcreteTest for every implementation of the Interface. The ConcreteTest should be a descendant of the AbstractTest and override the FactoryMethod to construct an instance of the implementation class.

Functional Compliance

Eric George's article gives a more detailed description of the pattern and describes it in terms of functional compliance. It is easy enough for the compiler to tell whether a class is syntactically compliant with an interface: it checks that all required methods have been implemented with the correct signatures (syntactic compliance). The compiler cannot, however, check the functional compliance of a class with its interface. Here's the formal definition given by Eric George:

Functional Compliance is a module's compliance with some documented or published functional specification. The specification can be purely documentational, or it can be partially enforced through Interfaces or Abstract Classes. Interfaces and Abstract Classes along with their associated documentation represent a contract between the implementation code and the client (or user) code. It is this contract that needs to be fully tested. The Liskov Substitution Principle (LSP) tells us that all modules that honor a contract (usually by implementing an interface), should behave the same from the perspective of the client code. A module's functional compliance is really the degree to which it obey's the LSP.

So what about Composite Unit Testing ?

The remarks from the readers were right. Composite Unit Testing is

  • an enhanced form of the Abstract Test Pattern,
  • a tool to test functional compliance.

There is, however, a major difference between ATP and CUT: the separation of the test code from the factory methods. In ATP, you create a ConcreteTest that inherits from AbstractTest and implements a factory method, so the code that creates the tested entity is "hard-coded" into the concrete test. In CUT, the framework takes care of retrieving the instances and feeding them to your test using user-specified factories (you can easily have multiple factories):

// ATP
// abstract test
public abstract class AbstractEnumerableTest
{
    public abstract IEnumerable Create();
    public void GetEnumeratorTest()
    {
        IEnumerable en = this.Create();
        ...
    }
}

// concrete implementation
[TestFixture]
public class ArrayListEnumerableTest : AbstractEnumerableTest
{
    public override IEnumerable Create()
    { return new ArrayList();}
}

The same test as above, using CUT:

// the fixture
public class EnumerableFixture
{
    public void GetEnumeratorTest(IEnumerable en)
    {
        ...
    }
}

// the factories
public class ArrayListFactory
{
    public ArrayList Empty
    { get{ return new ArrayList();}}
}

// link the fixture with the factories
[CompositeFixture(typeof(EnumerableFixture), typeof(IEnumerable))]
[ProviderFactory(typeof(ArrayListFactory),typeof(IEnumerable))]
public class EnumerableTest
{}
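The separation of test code and factories can be shown without any test framework. The following is a minimal sketch under assumptions, not how MbUnit is implemented: CompositeRunner and Run are hypothetical names, and factories are plain delegates instead of attribute-tagged classes.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class CompositeRunner
{
    // The test is written once, against the interface only.
    public static bool GetEnumeratorTest(IEnumerable en)
    {
        return en.GetEnumerator() != null;
    }

    // Runs the test against every instance produced by the factories
    // and returns the number of passing runs.
    public static int Run(IEnumerable<Func<IEnumerable>> factories)
    {
        int passed = 0;
        foreach (Func<IEnumerable> create in factories)
        {
            if (GetEnumeratorTest(create()))
                passed++;
        }
        return passed;
    }

    public static void Main()
    {
        // Adding an implementation to test is one more line here;
        // the test method itself is never touched.
        List<Func<IEnumerable>> factories = new List<Func<IEnumerable>>();
        factories.Add(() => new ArrayList());
        factories.Add(() => new Hashtable());
        Console.WriteLine("{0} of {1} passed", Run(factories), factories.Count);
    }
}
```

In the ATP version, each new implementation needs a new subclass of the abstract test. Here, the test and the list of factories evolve independently.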
posted on Monday, May 17, 2004 7:42:00 AM (Pacific Daylight Time, UTC-07:00)  #    Comments [3]
# Friday, May 14, 2004

The following article on CodeProject talks about Squarified Treemaps, an interesting tree visualization. I wonder what it would look like in MbUnit...

Demo application - treemaps.png

posted on Friday, May 14, 2004 2:17:00 PM (Pacific Daylight Time, UTC-07:00)  #    Comments [0]