One of the side projects that I’m working on in addition to this weblog is a managed scanner and parser for the Visual Basic .NET language. I started on the project because I’d really like to write some project analysis tools that I could use to determine things about how people use the language, but as it’s gone on I’m hoping to also release it to the community as a sample. (One of the wrinkles is that I’m “writing” the managed parser by looking at the existing unmanaged parser, so it’s not like the code I’m writing is completely of my own invention, although it is largely so.) I think that the more that people can work with a language in an automated way, and the more tools people can write for a language, the better it is for people who use that language. Anyway, we’ll just have to see. I still have to finish it first…
After I finished the scanner (scanners are easy), I started parsing expressions and have been working my way up the parse tree hierarchy. I’ve recently reached type members and, in particular, methods. An interesting thing is the way that VB’s line orientation interacts with MustOverride methods when parsing. In C#, parsing methods is pretty simple because after a method header you can see either a semicolon or an open curly brace. If it’s the former, then you’ve got an abstract method declaration; if it’s the latter, then you’ve got a concrete method declaration. Whether or not you specified “abstract” as a modifier on the declaration is really something for declaration semantics to sort out later on. In VB, though, the following fragment is ambiguous if you don’t look at the modifiers:
<modifiers> Sub Foo()
Dim Bar As Integer
In other words, once you’ve gotten through the method header, the next line starts with “Dim,” which could either mean that Foo was a MustOverride method and Bar is a field, or that Foo was a concrete method and Bar is a local. To figure out which is which, you absolutely have to look at the modifiers to see if “MustOverride” is there. This is the kind of thing that drives formal grammar writers nuts because it means you have to hork up your grammar productions to make it all come out right. It isn’t such a big deal, though, if you hand-code your parser, which is what we do for VB. (Hand-coded parsers vs table-driven parsers seem to be one of those religious arguments that language people get into. Let me just say I don’t take a formal position on the question.)
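To make that concrete, here is a rough sketch in Python of how a hand-coded, line-oriented parser might settle the question. This is not the code from my parser or from the real VB one. The names (parse_type_body, Method, TypeBody) are made up for illustration, and it assumes a toy subset of the language: whitespace-separated tokens, only Sub headers, and only Dim declarations. The point is just that the header's modifiers decide whether the lines that follow belong to the method or to the enclosing type.

```python
from dataclasses import dataclass, field


@dataclass
class Method:
    name: str
    modifiers: list
    has_body: bool
    locals: list = field(default_factory=list)


@dataclass
class TypeBody:
    methods: list = field(default_factory=list)
    fields: list = field(default_factory=list)


def parse_type_body(lines):
    """Parse a toy type body made of Sub headers and Dim declarations."""
    body = TypeBody()
    i = 0
    while i < len(lines):
        words = lines[i].split()
        lowered = [w.lower() for w in words]  # VB keywords are case-insensitive
        i += 1
        if not words:
            continue
        if "sub" in lowered and lowered[0] != "end":
            sub_at = lowered.index("sub")
            modifiers = lowered[:sub_at]
            name = words[sub_at + 1].split("(")[0]
            # The modifiers are the only thing that tells us whether a body follows.
            is_abstract = "mustoverride" in modifiers
            method = Method(name, modifiers, has_body=not is_abstract)
            body.methods.append(method)
            if not is_abstract:
                # Concrete method: everything up to "End Sub" belongs to it.
                while i < len(lines):
                    inner = lines[i].split()
                    inner_lowered = [w.lower() for w in inner]
                    if inner_lowered[:2] == ["end", "sub"]:
                        break
                    if inner_lowered and inner_lowered[0] == "dim":
                        method.locals.append(inner[1])
                    i += 1
                i += 1  # step past "End Sub"
        elif lowered[0] == "dim":
            # At type level, a Dim declares a field.
            body.fields.append(words[1])
    return body


abstract_case = parse_type_body(["MustOverride Sub Foo()", "Dim Bar As Integer"])
concrete_case = parse_type_body(["Public Sub Foo()", "Dim Bar As Integer", "End Sub"])
```

With the MustOverride header, Bar ends up in the type's fields. With the Public header, the very same Dim line ends up in Foo's locals. A grammar-driven parser would need separate productions to get the same effect, which is the part that makes grammar writers unhappy.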
Interestingly, there are at least a few other places in the language (especially in telling object creation expressions apart from array creation expressions) where things get complicated like this. But overall, it’s been pretty smooth sailing. I’ll let everyone know as the project progresses.