# Thursday, January 15, 2009

Phil Haacked has started an interresting discussion on the implementation of a named formatter whose format strings are of the form {name} instead of {0}. In fact, he started with 3 implementations and his reader submitted 2 more! Since Phil was kind enough to package all the implementations (and the unit tests) in a solution, I took the liberty to run Pex on them and see if there was any issue lurking out. The goal of this post is not really to show that those implementation are correct or not, but give you an idea where Pex could be applicable in your code.

But wait, it’s already Unit Tested!

Indeed, Phil diligently wrote a unit test suite that covers many different use cases. Does it mean you are done with testing? The named format syntax can be tricky since may involve escaping curly braces… What is the named format syntax anyway? It’s pretty obvious that it should let you write something like “{Name} is {Status}” instead of “{0} is {1}” but that still pretty vague. In particular, what are the syntactic rules for escaping curly braces (what’s the meaning {{}{}{{}}}, etc…) or is there even a grammar describing named formats.

As it is often the case, there is no clear and definitive answer – at least I could not find it –. In this post, I’ll show 2 techniques that can be used here out of the many patterns for Pex which we documented.

Technique 1: Explore for runtime errors (pattern 2.12 – parameterized stub)

The most basic way of running Pex is to simply apply Pex on a method without adding any assertions. At this point, you are looking for NullReferenceException, IndexOutOfRanceException or other violations of lower level APIs. Although this kind of testing won’t help you answer whether your code is correct, it may give you back instances where it is wrong. For example, by passing any format and object into the format method in a parameterized unit test; the code below is a parameterized unit test for which Pex can generate relevant inputs:

public void HenriFormat(string format, object o)

Note that it took a couple runs of Pex to  figure the right set of assemblies to be instrumented. Hopefully, this process is mostly automated and you simply have to apply the changes that Pex suggests.

The screenshot below shows the input generated by Pex. Intuitively, I would think that FormatException is expected and part of properly rejecting invalid inputs. However, there are ArgumentOutOfRanceExceptions triggered inside of the DataBinder code that probably should intercepted earlier. If this implementation uses the DataBinder code, it should make sure that it only gets acceptable values. Whether those issues are worth fixing is a tricky question: maybe the programmer intended to let this behavior happen.


Technique 2: Compare different implementations (pattern 2.7 – same observable behavior)

A second approach is to compare the output of the different implementations. Since all the implementations implement the same (underspecified) named format specification, their behavior should match with respect to any input. This property makes it really easy to write really powerful parameterized unit tests. For example, the following test asserts that the Haacked and Hanselman implementation have the same observable behavior, i.e. return the same string or throw the same exception type (PexAssert.AreBehaviorEqual takes care of asserting this):

public void HaackVsHansel(string format, object o)
        () => format.HaackFormat(o),
        () => format.HanselFormat(o)

Again, this kind of testing will not help to answer if the code is correct but it will give you instance where the different implementations behave differently, which means one of the two (or even both) have a bug. This is a great technique to test a new implementation against an old (fully tested) implementation which can be used as oracle. We run the test and Pex comes back with this list of test cases. For example, the format string “\0\0{{{{“ leads to different output for both implementation. From the different outputs, “\0\0{{“ vs “\0\0{{{{“, it seems that curly braces are escaped differently. If I wanted to dig deeper, I could also simply debug the generated test case.


Comparing All Implementations

Now that we’ve seen that Phil and Scott were not agreeing, could we apply this to the other formatters. I quickly set up a T4 template to generate the code for all the parameterized unit tests between each formatters. Note that order matters for Pex: calling A then B might lead to a different test suite compared to B then A, just because Pex explores the code in different orders.

<#@ template language="C#" #>
string[] formatters = new string[] {
using System;
using Microsoft.Pex.Framework;
using Microsoft.Pex.Framework.Validation;
using Microsoft.VisualStudio.TestTools.UnitTesting;
using UnitTests;
using StringLib;

namespace UnitTests
    [PexClass(MaxConstraintSolverTime = 2, MaxRunsWithoutNewTests = 200)]
    public partial class StringFormatterTestsTest
        <# foreach(string left in formatters) { 
           foreach(string right in formatters) {
               if (left == right) continue;    
        public void <#= left #>Vs<#= right #>(string format, object o)
                () => format.<#= left #>Format(o),
                () => format.<#= right #>Format(o)
        } #>

The answer that Pex returns is interresting and scary: none of the implementation behaviors match. This is somewhat not surprising since they all use radically different approaches to solve this problem: regex vs string api etc…

Try it for yourself.

The full modified source is available for download here. You will need to install Pex from http://research.microsoft.com/pex/downloads.aspx to re-execute the parameterized unit tests.