Lang .NET 2006

Erik Meijer has posted an announcement for Lang .NET 2006, a Microsoft-sponsored language conference, over on Lambda the Ultimate that I thought I would point out to any readers who might be interested. A bit of the announcement:

Lang .NET 2006 is a forum for discussion of programming languages, managed execution environments, compilers, multi-language libraries, and integrated development environments. It provides an excellent opportunity for programming language implementers and researchers from both industry and academia to meet and share their knowledge, experience, and suggestions for future research and development in the area of programming languages.

Lang .NET 2006 will be held from August 1-3 at Microsoft, and Erik has assured me that he is going to press me into service as a speaker, so be warned… Actually, we’ve got more than a few interesting non-LINQ things in the pipeline that maybe we’ll be able to talk more about by then. (Here’s hoping.) So hope to see you all there!

Reason #17 why ShipIt awards are better than product boxes?

Because it’s much less likely that some a**hole is going to come into your office and steal your ShipIt awards. So far I’ve received eight product boxes, and four of them have been stolen out of my office. Three of them (VB 6.0, VB 2002 and VB 2003) disappeared out of my office sometime in the last week. What possible benefit anyone could get from old product boxes is beyond me…

(More context on this entry can be found here.)

My developers are smarter than your developers!

Over on his blog John Montgomery shares how some of Microsoft’s internal research puts the lie to the old canard that VB programmers aren’t as smart as other programmers. A short quote:

If you ask VB developers how much education they have, about the same have college degrees as C/C++ developers (we ask what their primary language is and cross reference by level of education attained), and only negligibly more C/C++ developers have graduate degrees. More than that, more pro developers who primarily use VB say they have an undergrad CS degree than pro developers who use C/C++ primarily (though more C/C++ primary devs say they have an engineering degree). In other words, C/C++ pro devs are more likely to have an engineering degree than VB devs (it’s outside the margin of error, but just barely).

Longer entry is here.

Filling the pipeline…

I was on a thread today where a VB MVP asked if VB was ever going to integrate regular expression-type functionality directly into the language to make it easier (and more comprehensible) to work with strings. The immediate reaction in my head was, “Well, it’s in the pipeline, but no telling if it’ll ever come out…” Which bears a little explanation…

The traditional model for software development at Microsoft is something I’d call “punctuated insanity.” That is, you start a product version by running around like a chicken with your head cut off, trying to come up with what you think you should do in that version. Because you only just started thinking about what you think you need to do, the schedule goes through a phase of rapid expansion as people try and flesh out the initial ideas (which always cost more the more you think about them) and try to jam in all the other good ideas they had (because they know it’s going to be another couple of years before they’ll get to put new stuff into the product again). This is the “insanity” phase of the product, where reality takes an extended vacation and the product plan goes all over the place.

Usually, after some period of insanity, reality (a.k.a. management) comes back from its vacation, finds the house trashed, the fridge empty and the car at the bottom of the pool and freaks out. At this point, some combination of feature paring and schedule slippage happens as the product plan slowly starts to come back to something that is actually shippable. (The product likely goes through several rounds of this before making it to RTM.)

The problems with this model are pretty obvious on the back end: schedule slips and feature cuts are never a happy thing from either the company’s or the customer’s perspective. But I think more damage is done on the front end of things. Because planning only ever happens at the beginning of a product cycle, our ability to react to ongoing changes in the marketplace is seriously compromised. Even worse, because all planning happens in a very short burst, it’s very likely that any sufficiently complex feature isn’t going to get the kind of design time that it really needs. Put another way, features only get designed once and you have to hope the feature is right, because you have only a limited ability to react to external and internal feedback once you get the feature finished and you can actually use it.

There are enough obvious examples of this process in action that I don’t even need to bother to point them out.

Much better, when you can do it, is the “pipeline” model of software development that you see much more often in our web offerings. In the pipeline model, you’re never working on just one release. While the main bulk of the team may be working on the next release, you’ve always got some number of people working on features that are further back in the pipeline. What this means is that, first of all, you have people who are actively looking ahead beyond the current release and can start reacting to changes in the marketplace before you’ve finished the current version. It also means that more complex features can have longer “bake” times in the pipeline, allowing teams to try stuff out, see how it flies and discard or modify designs that don’t work. The result should be a more complete, more polished set of features when they finally reach the front of the pipeline (assuming they aren’t discarded entirely before that time).

LINQ has so far been a good example of pipelining in action, and it’s something that we’re actively trying to adopt within the VB team itself, at least on the language side. Regular expression integration is one of those things that has been hanging out there on the “consider one of these days” list for a long time. My hope is that as our pipeline gets going, these ideas will finally get their chance to be tried out and then either discarded or moved towards actual implementation. I can’t say whether regular expression integration will be a good idea or not, or whether it’ll ever show up in VB. But with a more forward-looking approach, I’m hoping we’ll find out…

Local variables: scope vs. lifetime (plus closures)

Over a month ago, I asked what a particular chunk of code should do:

Module Module1
    Sub Main()
        For i As Integer = 0 To 2
            Dim x As Integer
            Console.WriteLine(x)
            x += 1
        Next
        Console.ReadLine()
    End Sub
End Module

I purposefully left the question open and vague because I wanted to see what the community feedback would be without any kind of preconceived notions. I didn’t expect it to take me so long to return to this question, so I apologize if people got frustrated waiting, but I do want to get back to why I asked and what I think about the whole question. Let’s start by getting what actually happens out of the way: the program prints “0 1 2”. The reason for this takes a little bit of explaining.

What’s important here are two related but different ideas: scope and lifetime. The scope of a variable decides where a variable’s name can be used in a program. The lifetime of a variable decides how long the storage for that variable exists in memory. (In most programming languages the scope of a local variable is at the very least a subset of the lifetime of the variable, otherwise you’d be able to refer to the local variable after its storage goes away, which would be bad.)

So the question now boils down to: what’s the lifetime of a local variable in VB? Most people who assumed that the answer would be “0 0 0” made the reasonable assumption that the lifetime of a local variable is the same as the scope of the local variable. So they expected that when the code reached the end of the For…Next block, they’d reached the end of both the scope and the lifetime of the local variable x and the storage for x would go away. Then, when the loop started up again, we would give you a whole new storage location for x that (like all storage locations) was initialized to zero.

However, those of you who tried it out discovered that in VB the lifetime of a local variable does not equal its scope. In fact, the lifetime of a local variable is from the beginning of a method all the way through to the end of a method, regardless of the variable’s scope. Even though x is only in scope within the For…Next loop’s statement block, it lives throughout the entire method. Thus, when you loop, you get the same storage location as you got the last time. And thus you get “0 1 2” instead of “0 0 0”. And, in fact, this is consistent with the way the Common Language Runtime works. When you define a method, you declare the locals that the method is going to use. When you enter the method, the CLR creates storage for those local variables and initializes them to zero. And when you exit the method, the CLR throws away the storage for those local variables. So VB is actually entirely in sync with what its platform does. And it’s the same for C#, only they finesse the issue: since you have to explicitly initialize all locals in C#, there’s no way to observe whether the lifetime of a local variable extends beyond its scope. But their local variables live just as long as the ones in VB.
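To make the difference concrete, here is a minimal sketch (not part of the original sample; the module name LifetimeSketch is made up) that puts the original declaration next to one with an explicit initializer. The initializer runs every time the Dim statement executes, so the second loop behaves the way the “0 0 0” camp expected, however long the underlying storage lives.

```vb
Imports System

Module LifetimeSketch

    Sub Main()
        ' No initializer: x keeps whatever the previous iteration left in it.
        For i As Integer = 0 To 2
            Dim x As Integer
            Console.WriteLine(x)   ' prints 0, then 1, then 2
            x += 1
        Next

        ' Explicit initializer: the assignment runs on every pass through the loop.
        For j As Integer = 0 To 2
            Dim z As Integer = 0
            Console.WriteLine(z)   ' prints 0 each time
            z += 1
        Next
    End Sub

End Module
```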

This whole discussion is something of a minor point, at least until you get to closures, that is. What are closures, you ask? Well, the best way to explain them is by example. Let’s say you’ve got code that looks like this:

Sub Main()
    Dim value As Integer
    Dim xs = { 1, 2, 3, 4 }

    value = 2
    Dim ys = Select x From x In xs Where x < value

    For Each y As Integer In ys
        Console.WriteLine(y)
    Next
    Console.ReadLine()
End Sub

You’ll notice here that the query references the local variable “value”. Those of you well versed in the intricacies of LINQ will know, however, that the way LINQ works is that it pulls the expression “x < value” off into a function, a delegate of which gets passed to the Where method. Then the Where method uses this delegate to determine which members of the xs collection are filtered out. But how can we pull out the expression “x < value” to another method when the expression refers to a local variable? One method can’t see another method’s locals! Or can it…?

What happens in this case is we use a closure. A closure is just a special structure, living outside of the method, that holds the local variables that other methods need to refer to. When a query refers to a local variable (or parameter), that variable is captured by the closure and all references to the variable are redirected to the closure. So the statement “value = 2” assigns the value 2 to a variable location in a closure, not a variable location on the stack. Since the closure lives outside of the method, methods created by a LINQ query can legally refer to the local variables captured in the closure. And it all just works.
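As a rough illustration of the idea, here is what a hand-written equivalent of the closure for “value” might look like. The names ValueClosure, CapturedValue and IsBelowValue are made up for this sketch, the code the compiler actually generates will differ, and a plain loop stands in for the Where method.

```vb
Imports System

Module ClosureSketch

    ' Hypothetical stand-in for the structure the compiler would generate.
    ' The "local" now lives in an object on the heap, not on the stack.
    Class ValueClosure
        Public CapturedValue As Integer

        ' The expression "x < value", pulled out into its own method.
        Function IsBelowValue(ByVal x As Integer) As Boolean
            Return x < CapturedValue
        End Function
    End Class

    Sub Main()
        Dim closure As New ValueClosure()
        Dim xs As Integer() = {1, 2, 3, 4}

        ' "value = 2" becomes an assignment to the closure's field.
        closure.CapturedValue = 2

        ' The delegate carries a reference to the closure along with the method.
        Dim predicate As Func(Of Integer, Boolean) = AddressOf closure.IsBelowValue

        For Each x As Integer In xs
            If predicate.Invoke(x) Then
                Console.WriteLine(x)   ' prints 1
            End If
        Next
    End Sub

End Module
```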

I’m purposefully skipping over a lot of the nitty-gritty of how closures work to avoid writing a whole chapter on this subject, but the practical upshot of this is that with closures, the lifetime of a local in an inner block becomes a whole lot more important. Let’s go back to a modified version of our original code:

Module Module1
    Sub Main()
        Dim queries(2) As IEnumerable(Of Integer)
        Dim xs = { -2, -1, 0, 1, 2 }

        For i As Integer = 0 To 2
            Dim y As Integer = i
            queries(i) = Select x From x In xs Where x <= y
        Next

        For Each q As Integer In queries(0)
            Console.WriteLine(q)
        Next
        Console.ReadLine()
    End Sub
End Module

The intent of this code is to create an array of queries that have different upper bounds: queries(2) will return all values less than or equal to 2, queries(1) will return all values less than or equal to 1, and queries(0) will return all values less than or equal to zero. At least, that’s the intent. But if you go try this on the current LINQ code on my machine (not sure if it’ll run on the latest CTP or not), you’ll actually get the following result: “-2 -1 0 1 2”. Huh? The problem is that, if you’ll remember, the variable y lives for the entire method. Each iteration of the loop doesn’t get its own copy of y, it gets the same copy of y that every other iteration gets. This means that when the query captures the local variable y, each iteration of the loop captures the same copy of y. So when y gets changed inside of the loop, the y that every query sees gets changed. By the time queries(0) actually runs, y holds its final value of 2, and all of the queries return the same set of values.

What you really want in this case is for each iteration of the loop to capture a unique copy of y. In other words, you want to treat y as if its lifetime were only the inner part of the loop, not the whole method. And if you look at what C# does with anonymous delegates (and, now, lambda expressions), you’ll see this is what they do: since they require definite assignment, they can behave “as if” variables in inner scopes have shorter lifetimes than the entire method (even though they really don’t). To accomplish this, they have to use nested closures, which is beyond the scope of this entry and is left as an exercise to the reader (for the moment, at least).
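To show what “a unique copy of y per iteration” means, here is a hand-written sketch that creates a fresh closure object on every pass through the loop. The names BoundClosure and AtMostY are made up, plain delegates stand in for the queries, and this is only an illustration of the desired behavior, not a description of what either compiler actually emits.

```vb
Imports System

Module PerIterationClosureSketch

    ' Hypothetical closure holding one iteration's copy of y.
    Class BoundClosure
        Public y As Integer

        ' The expression "x <= y", pulled out into its own method.
        Function AtMostY(ByVal x As Integer) As Boolean
            Return x <= y
        End Function
    End Class

    Sub Main()
        Dim filters(2) As Func(Of Integer, Boolean)
        Dim xs As Integer() = {-2, -1, 0, 1, 2}

        For i As Integer = 0 To 2
            ' "As New" runs on every iteration, so each filter gets its own y.
            Dim closure As New BoundClosure()
            closure.y = i
            filters(i) = AddressOf closure.AtMostY
        Next

        ' filters(0) still sees y = 0, so this prints -2, -1 and 0.
        For Each x As Integer In xs
            If filters(0).Invoke(x) Then
                Console.WriteLine(x)
            End If
        Next
    End Sub

End Module
```

If the closure object were instead created once, before the loop, all three filters would share a single y and the final loop would print all five values, which is the behavior described above.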

So, the practical upshot is that with the introduction of closures to VB (regardless of whether we expose lambda expressions, which is still a bit of an open question), we’ve got a problem with local variable lifetime. We could use our flow analysis, introduced in VB 2005 for warnings, to perhaps finesse this issue the way C# does, but there are some complications. It’s very much an open issue, which is why I really wanted to see what people’s expectations were — it’s really useful data for understanding how people (at least those who read my blog) think about the problem.

Expect more down the road once we’ve got more of a handle on the problem, and kudos to anyone who made it this far.

Updated 3/29/06: Corrected code error!

New Sample: VBParser 8.0

OK, it took a while, but my updated version of the VBParser source code sample is finally up! Since the 7.1 version of VBParser ended up being essentially a read-only project on GotDotNet, I’ve decided to eschew GDN and just go with a straight source download. I’ve added a new set of links on the left hand side of my blog entitled “Samples,” and put both the 7.1 and 8.0 versions of VBParser there.

Things that are new or changed in VBParser 8.0:

  • I’ve updated the source to take advantage of VB 8.0 features such as generics, IsNot, etc. Because of this, VBParser 8.0 is not API compatible with VBParser 7.1. Any code that’s built with VBParser 7.1 will need to be updated, but it shouldn’t be too painful a task.
  • As the name implies, VBParser 8.0 should parse all of the new VB 8.0 features. There is also a 7.1 compatibility mode that you can use to parse VB 7.1 code (which is a strict subset of VB 8.0).
  • This is a sample that I wrote on my own, which means that the only testing that it got was the testing that I did on my own. As such, there are no guarantees made of completeness, correctness or suitability for any particular purpose. I did try my best to make sure that it works well, but you should consider the sample as being in permanent beta. There may be stupid bugs still lurking in the source.
  • Please drop me a line with any and all bugs that you find and I’ll be happy to take a look at them and fix them as time permits. If you end up using VBParser for something, let me know!

You can get VBParser 8.0 here. [07/12/2014: The sources for VBParser 8.0 are now on GitHub.] You can get VBParser 7.1 (the previous version) here. [07/12/2014: The sources for VBParser 7.1 are now on GitHub.] Hope people find it useful!

Firefly: Letting go…

One of the things I’ve been meaning to do for a long time now is put an official closer on my earlier Firefly entries. So let’s do it.

I saw Serenity a while back when it was still in the theaters, and I did really like it. However, when I walked out it was clear to me that it was over. I don’t exactly know why I felt that way, but I did. Maybe it was the character deaths in the movie, maybe it was the fact that the movie seemed to compress what should have been at least a season or more worth of episodes into two hours, maybe it was the fact that Firefly was really designed to be on the small screen and just didn’t translate well enough to the big screen. I don’t know. Either way, the movie didn’t seem to do well enough to really keep the whole thing going, which is a shame, but there you are — at least we did get an idea of where a bunch of the ongoing sub-plots were headed. Those of you who don’t want to give up hope, head on over to www.fireflyseason2.com.

Since it usually comes up when discussing Firefly, I’ll also mention that I finally gave up on Battlestar Galactica. Now, I know lots of people absolutely love BG. I’m not telling you you’re wrong. I’m just saying that for some reason for me, it just didn’t do it. That’s the way things go.

Next up is going to see V for Vendetta. The trailers look as if they kept a lot of the storyline from the graphic novel. Will this be the first Alan Moore adaptation not to suck? We’ll see…

I haven’t gone AWOL…

…life just went and got very complicated for a while. In particular:

  • My wife Andrea has had some strange medical symptoms over the past few months that our neurologist thought indicated pretty strongly she has MS. So the past month has been spent getting MRIs and lumbar punctures to try and narrow down the diagnosis. The good news? It’s pretty unlikely she has MS or any of the related types of diseases (RA, Lupus, etc.). The bad news? Still not sure what’s going on. Still, since none of the symptoms are extremely serious, I’ll take “don’t know” for now, given the alternatives.
  • We’re pushing to get another LINQ preview out in the near future and I’ve been on the hook to write a number of new features for the preview. It’s been great to get the chance to write a lot of code again, but since that’s not my day job any more, it has contributed to a real time crunch.
  • The whole question of scripting languages, dynamic languages, dynamic environments, loose typing, etc, etc, etc. which has been bubbling around in the background for well over two years seems, all of a sudden, to be coming to quite a boil. Nothing in particular to talk about yet, but lots of interesting and exciting stuff bouncing around and perhaps some quite interesting things to talk about in the near future.

Anyway, I think I should be back now, barring any unexpected surprises. Time to get back to that question of local variables, eh?