# Sunday, July 08, 2007

I finally posted a project that I had been developing last year in my daily commutes: Toad (www.codeplex.com/toad).

Toad is an (experimental) compiler infrastructure to build languages top of .Net, without knowing much about IL. Mostly it was a way for me to learn first hand all the little subtleties of writing multiple language on top of .Net.

What does Toad give you?

  • A dynamic console that supports multiple languages,
  • An Expression/Statement tree that can be translated back to IL,
  • Built-in debugging capabilities,
  • A set of Visitors on top this AST that implement various optimization and language constructs (this is the fun part),
  • There's a lot of stuff, I used to take the bus twice a day. :)
  • Toad does not use the DLR maybe in the future.

What is it good for?

  • Toy around with IL and try to build your micro language.
  • Don't use it for anything serious.

Good For Nothing Language

 The 'Good for nothing' language is a toy language that Joel Pobar used to illustrate managed IL compilers. With Gfn, you can write simple program such as:

var i = 123;
print i;
for j = i to 10 do print i+j; end;

The parsing stuff

I took Joel's example and wrote a grammar for the Gold parser. It looks like something like this that is then compiled by Gold:

<Program> ::= <BlockStatement>

<BlockStatement> ::= <BlockStatement> <Statement> ';'
            | <Statement> ';'

<Statement> ::= 'var' Identifier '=' <Expression>
    | Identifier '=' <Expression>
    | 'for' Identifier '=' <Expression> 'to' <Expression> 'do' <BlockStatement> 'end'
    | 'read_int' Identifier
    | 'print' <Expression>
    | 'return' <Expression>

At the end of the process, you get a visitor for the code AST. This is the boiler plate code to start using Toad to build the compiler:

// <Statement> ::= for Identifier '=' <Expression> to <Expression> do <BlockStatement> end
protected virtual object VisitRuleStatementforIdentifierEqtodoend(SyntaxNode node)
{
     throw new NotSupportedException("RuleStatementforIdentifierEqtodoend");
}

Implementing the 'for' loop

Let's take a look at the implementation of the for loop. We need to:

  • extract the from, to expressions,
  • declare a variable for the index,
  • build the predicate i < to etc...,
  • emit symbols for debugging,

The code below shows the full implementation of that statement:

// <Statement> ::= for Identifier '=' <Expression> to <Expression> do <BlockStatement> end
protected override object VisitRuleStatementforIdentifierEqtodoend(SyntaxNode node)
{
    // from, to
    Expression from = (Expression)this.VisitNode(node[3]);
    Expression to = (Expression)this.VisitNode(node[5]);

    // x = from
    VariableDeclarationStatement init = Stm.Var(node[1].Data, from);

    // body
    this.Context.PushScope();
    this.Context.PushVariable(init.Variable);

    Unit body = (Unit)this.VisitNode(node[7]);

    // i < to;
    Expression ilto = Expr.LessThanOrEqual(Expr.Var(init), to);
    ilto.Symbol = FromNode(node[1]);
    // ++i
    Statement inc = Stm.Expr(Expr.PrePlusPlus(Expr.Var(init)));

    this.Context.PopScope();

    ForStatement forstm = Stm.For(init, ilto, inc);
    forstm.Body = Stm.FromUnit(body);

    return forstm;
}

Hooking up the visitor in Toad

The last step is to package our visitor as a 'language' and return the generated statements or expression to Toad. We can spin up the interactive console, load Gfn and try it out:

image

Nothing really mind blowing here. Let's turn on live IL debugging to see what happening under the hood. This mode emits the IL source of the method, adds a bunch of 'nop' instructions to force the debugger to step on each IL instruction:

image

You can see in the generated code how the expressions got translated into IL instructions. 

posted on Sunday, July 08, 2007 3:23:32 PM (Pacific Daylight Time, UTC-07:00)  #    Comments [0]
Related posts:
Comments are closed.