Background compilation, part 1

Roy points to Philip’s complaint that VB still exhibits problems with multi-language solutions that have been around since the VS 2002 beta. Philip’s completely correct, and why this bug still hasn’t been fixed, even though we’ve known about it since before VS 2002 shipped, bears some explanation. Specifically, the problem stems from a mistake we made when designing our background compilation system a very long time ago. Since I’ve been asked more than a few times about how background compilation works, this is an excellent chance to delve into that subject. So let me talk about background compilation for a while and then we’ll get back to Philip’s bug.

“Background compilation” is the feature in VB that gives you a complete set of errors as you type. People who move back and forth between VB and C# notice this, but VB-only developers may not realize that other languages such as C# don’t always give you 100% accurate Intellisense and don’t always give you all of the errors that exist in your code. This is because their Intellisense engines are separate, scaled-down compilers that don’t do full compilation in the background. VB, on the other hand, compiles your entire project from start to finish as Visual Studio sits idle, allowing us to immediately populate the task list with completely accurate errors and allowing us to give you completely accurate Intellisense. As Martha would say, it’s a good thing.

However, doing background compilation is a tricky prospect. The problem is that just as soon as you’ve finished compiling the project in the background, the user is likely to do something annoying like edit their code. Once they do that, the application you just finished compiling is now incorrect – it doesn’t reflect the current state of the user’s code anymore. So, the question is: how do you handle that? The brute force way would be to throw away the entire result of the compilation and start over again. However, since Intellisense depends on compilation being mostly complete, this is impractical – given a reasonably large project, you may never get the chance to provide Intellisense because by the time you’re almost done recompiling the whole project, the user has had the chance to type in another line of code, thus invalidating all the work you just did. You’ll never catch up.

To deal with this, we implement a concept we call “partial decompilation.” When a user makes an edit, instead of throwing the entire compilation state away, we figure out the smallest amount of stuff we can throw away and then keep everything else. Since most edits don’t actually affect the project as a whole, this means we can usually throw out minimal information and get back to being fully compiled pretty quickly. Here’s how we do it: each file in the project is considered to be in one of the following states at any one time:

  • NoState: We’ve done nothing with the file.
  • Declared: We’ve built symbols for the declarations in the file, but we haven’t bound references to other types yet.
  • Bound: We’ve bound all references to types.
  • Compiled: We’ve emitted IL for all the properties and methods in the file.

When a project is compiled, all the files in the project are brought up to each successive state. (In other words, we have to have gotten all files to Declared before we can bring any file up to Bound, because we need to have symbols for all the declarations in hand before we can bind type references.) When all the files have reached Compiled, then the project is fully compiled.
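To make the ordering concrete, here’s a minimal sketch in Python. It is not how the VB compiler is actually written; the names FileState and compile_project are made up for illustration, and the real work of each phase is reduced to bumping a state value. The point is only that every file reaches one state before any file moves on to the next.

```python
from enum import IntEnum


class FileState(IntEnum):
    # The four per-file states described above, in order.
    NO_STATE = 0
    DECLARED = 1
    BOUND = 2
    COMPILED = 3


def compile_project(states):
    """Raise every file to COMPILED, one phase at a time.

    `states` maps a file name to its current FileState and is updated in
    place. Returns the list of (file, new_state) steps in the order taken.
    """
    steps = []
    for target in (FileState.DECLARED, FileState.BOUND, FileState.COMPILED):
        # Finish this phase for all files before starting the next phase.
        for name in sorted(states):
            if states[name] < target:
                states[name] = target
                steps.append((name, target))
    return steps


states = {"a.vb": FileState.NO_STATE, "b.vb": FileState.NO_STATE}
steps = compile_project(states)
for name, state in steps:
    print(name, "->", state.name)
```

Both files are reported as DECLARED before either one is reported as BOUND, which mirrors the rule that all declarations must be in hand before any type reference can be bound.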

Now let’s say that a user walks up to a project that’s reached Compiled state and makes an edit to a file. The first thing that we have to do is classify the kind of edit that the user made. (Keep in mind that “an edit” can actually be an extremely complex one if the user chose to cut and paste one block of code over another block of code.) Edits can generally be broken down into two classifications:

  • Method-level edits, i.e. edits that occur within a method or a property accessor. These are the most common and also the easiest to deal with because a method-level edit can never affect anything outside of the method itself.
  • Declaration-level edits, i.e. edits that occur in the declaration of a type or type member (method, property, field, etc). These are less common and can affect anyone who references them or might reference them anywhere in the project.

When an edit comes through, it’s first classified. If it’s a method-level edit, then the file that the edit took place in is decompiled to Bound. This involves the relatively small work of throwing away all the IL for the properties and methods defined in the file. Then we can just recompile all the methods and we’re back to being fully compiled. Not a lot of work. Say, though, that the edit is a declaration-level edit. Now, we have to do some more work.
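Before getting to the declaration-level case, the method-level case just described can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the real implementation: FileState, apply_method_level_edit, and recompile_methods are hypothetical names, and “throwing away IL” is modeled as nothing more than lowering the file’s state.

```python
from enum import IntEnum


class FileState(IntEnum):
    NO_STATE = 0
    DECLARED = 1
    BOUND = 2
    COMPILED = 3


def apply_method_level_edit(states, edited_file):
    """Decompile only the edited file to BOUND (its IL is discarded)."""
    states[edited_file] = min(states[edited_file], FileState.BOUND)


def recompile_methods(states):
    """Bring every BOUND file back up to COMPILED."""
    for name, state in states.items():
        if state == FileState.BOUND:
            states[name] = FileState.COMPILED


states = {"a.vb": FileState.COMPILED, "b.vb": FileState.COMPILED}

apply_method_level_edit(states, "a.vb")
after_edit = dict(states)  # snapshot: only a.vb dropped to BOUND

recompile_methods(states)
print(after_edit, states)
```

Only the edited file loses any state, and its symbols and bindings survive, so getting back to fully compiled means regenerating IL for one file.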

Earlier, when we were bringing files up to Bound state, we kept track of all the inter-file dependencies caused by the binding process. So if a file a.vb contained a reference to a class in b.vb, we recorded a dependency from a.vb to b.vb. When we go to decompile a file that’s had a declaration edit, we call into the dependency manager to determine what files depend on the edited file. We then decompile the edited file all the way down to NoState, because we have to rebuild symbols for the file. Then we go through and decompile all the files that depend on the edited file down to Declared, because those files now have to rebind all their name references in case something changed (for example, maybe the class the file depended on got removed). This is a bit more work, but in most cases the number of files being decompiled is limited and we’re still doing a lot less work than doing a full recompile.
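Here’s a rough Python sketch of that declaration-level path. Everything in it is an assumption made for illustration: the DependencyManager class, its record and dependents_of methods, and apply_declaration_level_edit are invented names, and the sketch only looks at direct dependents because that is all the description above covers.

```python
from collections import defaultdict
from enum import IntEnum


class FileState(IntEnum):
    NO_STATE = 0
    DECLARED = 1
    BOUND = 2
    COMPILED = 3


class DependencyManager:
    """Hypothetical stand-in: remembers which files reference which."""

    def __init__(self):
        # referenced file -> set of files that reference it
        self._dependents = defaultdict(set)

    def record(self, referencing_file, referenced_file):
        self._dependents[referenced_file].add(referencing_file)

    def dependents_of(self, file_name):
        return set(self._dependents.get(file_name, ()))


def apply_declaration_level_edit(states, deps, edited_file):
    # The edited file needs its symbols rebuilt from scratch.
    states[edited_file] = FileState.NO_STATE
    # Files that referenced it keep their own symbols but must rebind.
    for dependent in deps.dependents_of(edited_file):
        if dependent != edited_file:
            states[dependent] = min(states[dependent], FileState.DECLARED)


deps = DependencyManager()
deps.record("a.vb", "b.vb")  # a.vb references a class declared in b.vb

states = {name: FileState.COMPILED for name in ("a.vb", "b.vb", "c.vb")}
apply_declaration_level_edit(states, deps, "b.vb")
print({name: state.name for name, state in states.items()})
```

In this example c.vb has no recorded dependency on b.vb, so it keeps its state, which is where the savings over a full recompile come from.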

This is kind of the high-level overview of how it works – there are lots of little details that I’ve glossed over, and the process is quite a bit more complex than this, but you get the idea. I’m going to stop here for the moment and pick up the thread again in a few days, because there are a few more pieces of the puzzle that we have to put into place before we get to explaining the bug.

13 thoughts on “Background compilation, part 1”

  1. Mike Dimmick

    Another problem I’ve had is that the environment can’t seem to cope with two classes with the same name in different projects (building different assemblies) in the same solution – it throws a wobbly when compiling:

    Form1.vb(1) : error BC30175: class ‘Form1’ and class ‘Form1’, declared in ‘C:\Documents and Settings\mike\My Documents\Visual Studio Projects\vbtest1\Form1.vb’, conflict in namespace ‘vbtest1’.

  2. Pingback: .NET From India

  3. Pingback: help.net

  4. Pingback: VS DATA Team's WebLog

  5. Pingback: Sijin Joseph's blog

  6. Rui Quintino

    I’ve been facing problems related to this for a long time… without discovering any plausible solution, so please do continue. :) I would love to learn something that I could use to solve some issues and keep everyone happy.

    I’ve been working almost exclusively with .Net (C# and VB) for a few years and, although I usually prefer VB, I must say that I advise my teams to avoid VB if they expect to have a relatively large code base in the future. And that’s because, from experience gained in some large scale projects, the VB.Net IDE just doesn’t scale due to the background compilation. VS becomes *unusable*.

    Even though solutions are correctly separated across several projects, and developers only work with the subset of projects they need… it’s still slow and extremely uncomfortable working with VS & VB.NET.

    Using compiled DLLs is not an option I take seriously. Yes, we already use DLL references for stable resources, but not for frequently changing code.

    I’ve seen cases where developers just prefer copying the code they need to change into Notepad, doing all the work… without Intellisense, and then pasting it back into VS.

    Seriously, what I really want to know:
    -Am I missing something? Am I wrong? Some recent posts on this:

    http://groups.google.pt/groups?hl=pt-PT&lr=&ie=UTF-8&oe=UTF-8&selm=O9c6Ayu9DHA.2656%40TK2MSFTNGP11.phx.gbl

    http://weblogs.asp.net/rpooley/archive/2003/07/30/21818.aspx

    -Is this performance problem clearly resolved in Whidbey?
    -Can we disable it? (some secret tweak..?)
    -Any other tips?

  7. Eric Mutta

    A very informative account Paul, thanks :-).

    One suggestion about background compilation. How about you allow the user to customize just how far the compilation goes?

    You could, for instance, have the following levels:

    *level 1
    this would be the minimum and includes lexical and syntactical analysis. This would allow intellisense, the drop downs, object browser and class viewer to work and would allow early reporting of a large number of common errors.

    *level 2
    this may add basic semantic checks for use of keywords in certain contexts (e.g Exit Function when inside a Sub, using On Error inside a procedure with Try-blocks, specifying a non-interface type in an Implements declaration etc.)

    *level 3
    this is the highest level and adds more advanced things like type checking, name scoping, overload resolution, etc.

    The idea is that folks who work on projects large enough to make background compilation choke, are less likely to make mistakes that would be spotted in the higher compilation levels and can therefore turn those off (but get all the productivity goodies of level 1).

    The later stages of compilation (syntax tree preprocessing, IL code generation, generation of PE modules etc), could be a level 4 option, but are somewhat unnecessary as they are needed only when one is about to execute the code.

    Just a thought :-)

    1. paulvick

      Eric: Unfortunately, we already factor this as much as possible. As it turns out, things like Intellisense need a lot more information than just syntactic and lexical analysis. You end up really needing full bound information so you can determine what names mean and what members they have, and what the types of things are.

  8. Pingback: Richard Clark

  9. Pingback: Corrado's BLogs

  10. Pingback: Panopticon Central

  11. Pingback: Panopticon Central

