Saturday, July 14, 2007

Pex can analyze regular expressions*** and generate strings that matches them automatically!

What does it mean? Well, maybe, somewhere deep in your code your are validating some string with a regex (for example a url). In order to test the validation code, one needs to craft inputs that does not match (easy) and matches (harder) the regex.

Let Pex do it:

So what if Pex could be smart enough to understand a regex and craft inputs accordingly? In the example below, it would be very hard for a random generate to generate a string that matches the regex.

[PexTest]
public void Url([PexAssumeIsNotNull]string s)
{
    if (Regex.IsMatch(s, "(?<Protocol>\w+):\/\/(?<Domain>[\w@][\w.:@]+)\/?[\w\.?=%&=\-@/$,]*"))
        throw new PexCoverThisException(); // random won't find this
}

"" would be a failing match and foo://foo.com a good match. (To generate the correct match, my brain simulated the regex automaton and estimated one possible path). Interestingly, Pex generates....

Ā://Āā

Pretty ugly... but correct! This reminds us that while regex are used to validate input, what they'll let through is sometimes scary.

Compiled Regex + Pex = Love

The great part about supporting the regular expressions is that it comes for free (almost) since Regex can be compiled to IL in .Net. When the BCL generates the regex IL code, it effectily builds the automaton... which can be analyzed by Pex!!!

Refresher: Pex works analyzing the MSIL being executed.

Hey but not all Regex are compiled!

That's true. Compiling the regex is optional so Pex needs to do a little bit of 'plumbing' to make sure all regular expressions are compiled. This is simply done by substituting the real .ctor of the Regex class with a customized version that compiles the regex. I'll talk about substitutions deeper in the future.

*** Of course, the bigger the regex is, the harder it is going to be for Pex to craft a successful match.

Monday, July 16, 2007 2:59:30 AM UTC
Now *that* is cool!
Comments are closed.