Category Archives: Visual Basic 2008

Local variables: scope vs. lifetime (plus closures)

Over a month ago, I asked what a particular chunk of code should do:

Module Module1
    Sub Main()
        For i As Integer = 0 To 2
            Dim x As Integer
            Console.WriteLine(x)
            x += 1
        Next
        Console.ReadLine()
    End Sub
End Module

I purposefully left the question open and vague because I wanted to see what the community feedback would be without any kind of preconceived notions. I didn’t expect for it to take me so long to return to this question, so I apologize if people got frustrated waiting, but I do want to get back to why I asked and what I think about the whole question. Let’s start by getting what actually happens out of the way: the program prints “0 1 2”. The reason for this takes a little bit of explaining.

What’s important here are two related but different ideas: scope and lifetime. The scope of a variable decides where a variable’s name can be used in a program. The lifetime of a variable decides how long the storage for that variable exists in memory. (In most programming languages the scope of a local variable is at very least a subset of the lifetime of the variable, otherwise you’d be able to refer to the local variable after its storage goes away, which would be bad.)

So the question now boils down to: what’s the lifetime of a local variable in VB? Most people who assumed that the answer would be “0 0 0” made the reasonable assumption that the lifetime of a local variable is the same as the scope of the local variable. So they expected that when the code reached the end of the For…Next block, they’d reached the end of both the scope and the lifetime of the local variable x and the storage for x would go away. Then, when the loop started up again, we would give you a whole new storage location for x that (like all storage locations) was initialized to zero.

However, those of you who tried it out discovered that in VB the lifetime of a local variable does not equal its scope. In fact, the lifetime of a local variable is from the beginning of a method all the way through to the end of a method, regardless of the variable’s scope. Even though x is only in scope within the For…Next loop’s statement block, it lives throughout the entire method. Thus, when you loop, you get the same storage location as you got the last time. And thus you get “0 1 2” instead of “0 0 0”. And, in fact, this is consistent with the way the Common Language Runtime works. When you define a method, you declare the locals that the method is going to use. When you enter the function, the CLR creates storage for those local variables and initializes them to zero. And when you exit the function, the CLR throws away the storage for those local variables. So VB is actually entirely in sync with what it’s platform does. And it’s the same for C#, only they finesse the issue — since you have to explicitly initialize all locals in C#, there’s no way to observe whether the lifetime of a local variable extends beyond it’s scope. But their local variables live just as long as the ones in VB.

This whole discussion is something of a minor point, at least until you get to closures, that is. What are closures, you ask? Well, the best way to explain them is by example. Let’s say you’ve got code that looks like this:

Sub Main()
    Dim value As Integer 
    Dim xs = { 1, 2, 3, 4 }
 
    value = 2
    Dim ys = Select x From x In xs Where x < value
 
    For Each y As Integer In ys
        Console.WriteLine(y)
    Next
    Console.ReadLine()
End Sub

You’ll notice here that the query references the local variable “value”. Those of you well versed in the intricacies of LINQ will know, however, that the way LINQ works is that it pulls the expression “x < value” off into a function, a delegate of which gets passed to the Where method. Then the Where method uses this delegate to determine which members of the xs collection are filtered out. But how can we pull out the expression “x < value” to another method when the expression refers to a local variable? One method can’t see another method’s locals! Or can it…?

What happens in this case is we use a closure. A closure is just a special structure that lives outside of the method which contains the local variables that need to be referred to by other methods. When a query refers to a local variable (or parameter), that variable is captured by the closure and all references to the variable are redirected to the closure. So the statement “value = 2” assigns the value 2 to a variable location in a closure, not a variable location on the stack. Since the closure lives outside of the method, methods created by a LINQ query can legally refer to the local variables captured in the closure. And it all just works.

I’m purposefully skipping over a lot of the nitty-gritty of how closures work to avoid writing a whole chapter on this subject, but the practical upshot of this is that with closures, the lifetime of a local in an inner block becomes a whole lot more important. Let’s go back to a modified version of our original code:

Module Module1
    Sub Main()
        Di

m queries(2) As IEnumerable(Of Integer)
        Dim xs = { -2, -1, 0, 1, 2 }
 
        For i As Integer = 0 To 2
            Dim y As Integer = i
            queries(i) = Select x From x In xs Where x <= y
        Next
 
        For Each q As Integer In queries(0)
            Console.WriteLine(q)
        Next
        Console.ReadLine()
    End Sub
End Module

The intent of this code is to create an array of queries that have different upper bounds — so queries(2) will return all values less than or equal to 2, queries(1) will return all values less than or equal to 1, and queries(0) will return all values less than or equal to zero. At least, that’s the intent. But if you go try this on the current LINQ code on my machine (not sure if it’ll run on the latest CTP or not), you’ll actually get the following result: “-2 -1 0 1 2”. Huh? The problem is that, if you’ll remember, the variable y lives for the entire method. Each iteration of the loop doesn’t get its own copy of y, it gets the same copy of y that every other iteration gets. This means, though, that when the query captures the local variable y, each iteration of the loop captures the same copy of y. Which means that when y gets changed inside of the loop, all the queries’ copy of y gets changed. All of the queries are going to return the same set of values.

What you really want in this case is for each iteration of the loop to capture a unique copy of y. In other words, you want to treat y as if its lifetime was only the inner part of the loop, not the whole method. And if you look at what C# does with anonymous delegates (and, now, lambda expressions), you’ll see this is what they do — since they require definite assignment, they can behave “as if” variables in inner scopes have shorter lifetimes than the entire method (even though they really don’t). To accomplish this, they have to use nested closures, which is beyond the scope of this entry and is left as an exercise to the reader (for the moment, at least).

So, the practical upshot is that with the introduction of closures to VB (regardless of whether we expose lambda expressions, which is still a bit of an open question), we’ve got a problem with local variable lifetime. We could use our flow analysis, introduced in VB 2005 for warnings, to perhaps finesse this issue the way C# does, but there are some complications. It’s very much an open issue, which is why I really wanted to see what people’s expectations were — it’s really useful data for understanding how people (at least those who read my blog) think about the problem.

Expect more down the road once we’ve got more of a handle on the problem, and kudos to anyone who made it this far …

Updated 3/29/06: Corrected code error!

Updates to the XML integration syntax

Updated VB 9.0 (LINQ + XML) preview out!

10 Replies

I know this is old news, but I’ll say it anyway: several weeks ago, we released an updated preview of our proposed 9.0 features. The preview is enhanced in four primary ways:

We now support some Intellisense for Select expressions. This is a step forward in our investigation of the Select/From vs. From/Select question, so we’re definitely interested in feedback here.
We now support a lot more LINQ, specifically DLinq and variable capture (so you can now access local variables in queries). A huge chunk of my November/December went towards implementing lambda expressions and expression trees in Visual Basic and that, combined with some excellent work by another team member on variable capture, means a lot of stuff works now that didn’t before.
The editing/display experience for XML literals has been greatly enhanced — just having colorization makes a huge difference.
A bunch of extensions were made to the XLinq support to make working with namespaces and elements easier.

Amanda covers this in more detail in her entry, but this should give you a flavor. Eagle eyed readers will note that I said that I did a lot of work on “implementing lambda expressions,” but it’s important to realize that at this point lambda expressions are only used as a part of query comprehensions — there is no explicit syntax for lambda expressions yet. The priority was on getting the DLinq support working, and now that we’ve got that, we’re moving on to flushing out some of the remaining questions like lambdas…

Hope you enjoy!

Update 02/08/2006: I also forgot to mention that Amanda and I did a MSDN TV episode talking a bit about the new CTP.

Extension methods + late binding = trouble?

1 Reply

I’m clearing out some old mail and I came across a reference to a blog entry by Jon Skeet about extension methods that I saved a while back. He says:

One of the things I don’t like about the proposed extension methods is the way the compiler is made aware of them – on a namespace basis. “Using” directives are very common to add for any namespace used in a class, and quite often I wouldn’t want the extension methods within that namespace to be applied. I propose using using static System.Query.Sequence; instead, in a way that is analogous to the static imports of Java 1.5 (except without importing single members or requiring the “.*” part. This would make it clearer what you were actually trying to do.

The interesting thing is, if you look at the document that we released on LINQ at the PDC, you’ll see that our design for extension methods incorporates this suggestion: extension methods in VB are brought into scope by directly importing the containing type, not the namespace. Imports System.Query isn’t sufficient to get LINQ methods in scope; you have to say Imports System.Query.Sequence. I agree with Jon that it’s clearer to do it this way, but that’s not the whole reason we did it. You see, the real problem is late binding.

Yes, late binding. We still support that, you know? And if you think for a minute about late binding and extension methods the way C# does them, you’ll quickly see that the two things don’t go together very well. When we go to late bind a member “foo” on an instance of a type “bar” today things are relatively simple — we gather all the members of “bar” with the name “foo” and then apply our regular binding rules to determine which, if any, of them fit the bill. All we need to know at run-time is the type of the instance we’re late binding on. With extension methods, though, this breaks down. Now we need to know not just the type of the instance, but also all of the extension methods in scope at the point of invocation. That’s because if “bar” doesn’t have a member “foo,” then in the early-bound case the compiler is going to go looking for extensions method. And the late-binder needs to do this, too!

If you look at how C# does extension methods, they go out and start looking in all the enclosing namespaces for extension methods, then look in all the imported namespaces for extension methods, etc. Replicating this at run-time would be difficult, at best — at every late-bound invocation point you would have to capture the complete binding context. And this binding context would change from method to method. What a nightmare! Our design is more friendly to late-binding. Our current design says that if “bar” doesn’t have a member “foo,” then we’ll look only at types whose members have been imported for extension methods. This collapses down the search space hugely and also means that the binding context is per-file (since we only allow file-level imports, unlike C#). While still a bit bulky, this seems much more manageable. Although we’ll see — although we’ve implemented early-bound extension methods, we haven’t gotten to the late bound stuff yet. <g>

Although we felt we had to do extension methods this way because of late-binding, we also believe there are some other advantages to the scheme. Jon lists one: it becomes much easier to be clear about the extensions you are using. You won’t import some useful namespace and then, whoops!, you just added a whole bunch of extension methods that you didn’t want. It’s also fairly congruent with the fact that VB (again, unlike C#) allows you to import a type directly, allowing access to its shared members without qualification.

Of course, now that I think of it, I’m not sure how this works with standard modules and their special binding rules. Hmmmm. I’m going to have to look at that…

VB LINQ preview now updated for VB 2005 RTM

7 Replies

Now that VB 2005 has released to manufacturing (RTMed in TLA-speak), we’ve updated our installer for the VB LINQ preview. You can find it here. I think the pages at http://msdn.microsoft.com/vbasic/future won’t be updated to point to it until next week because of other changes they’re making to that sub-site after RTM, but for now you can click directly on the link above. Feel free to get the word out! The actual bits are the same as the PDC release — no new functionality — but rest assured we’re also working on an updated release with more features in the near future!

Updated 11/2/05: Fixed hyperlink (had a trailing “.”).

PDC videos online

1 Reply

You’ve probably seen this elsewhere, but… The PDC05 videos are now online and available to anyone who wants to watch them for (I believe) the next six months. You can catch my sessions there in case you missed the fun the first time around!

TLN308: Visual Basic: Future Directions in Language Innovation

PNL03: Scripting and Dynamic Languages on the CLR

PNL11: .NET Language Integrated Query End-to-End

Mark your calendars: VB 9.0 chat

Forest vs. trees

16 Replies

I notice that Darryl Taft has a very interesting article today over on eWeek asking “Will VB 9 Win Over the VB 6 Faithful?”. I think the headline is a bit off, since the real question is “Will VB 2005 Win Over the VB 6 Faithful?” given VB 9.0’s status as almost-entirely-vaporware at this point. The answer to that question is “yes, definitely, in my opinion,” but time will tell. Only once we know what happens with VB 2005 will we really be able to start talking about how VB 9.0 will or won’t change that new status quo.

There was one section that caught my eye, though:

Will Microsoft’s play to win back developers be successful?

Joel Spolsky, founder of Fog Creek Software in NY, said he thinks Microsoft should keep things simple.

“My impression is that the developers who appreciated VB for its simplicity and easy learning curve have long since given up and aren’t going to be impressed by language ‘featuritis,’ and are certainly not going to be pleased by a raft of complex new language features to learn,” Spolsky said.

“Those who might be excited by something like LINQ are the exact kind of programmers who already switched to C#, or, increasingly, Python. I fear that by adding these complex new language features VB.Net stands to alienate what’s left of its core constituency who just want to get things done.”

This is the downside of announcing these features at the PDC, which is a über-geek conference, instead of someplace like TechEd, which is more of a conference for normal human beings (or at least as normal as programmers ever get). Yes, indeed — if you go and download our VB 9.0 introductory paper, you will be regaled with all kind of cool geeky language features like type inference and anonymous types and query comprehensions! All aimed at the kind of language wonks who are going to be going to the PDC and downloading such papers so they can dissect the semantics of each feature on their blogs. This is fine as far as it goes, but the problem with all this is that then people like Joel start to lose sight of the forest for the trees because they start to think that the average user is going to actually have to give a damn about all this wonky stuff to be able to get their work done.

The truth is that if you go back to the halcyon days of VB 6.0, you’ll find a phenomenal amount of complexity in the language. Just go read a book like this one and you’ll find out just how many complex features went into making simple things work in VB. And yet, all that complexity and featuritis was hidden under a fairly approachable and simple facade that allowed people to, as Joel puts it, “just get things done.” This is really no different. The practical upshot of LINQ is that you’ll be able to take objects that you already work with every day and easily write queries over them. That’s about it. You won’t have to understand the mechanics of anonymous types, lambda expressions, type inference, extension methods, or query comprehensions to make it work. You won’t even have to know what any of those things are. All you have to know is that you can type “Select” and write a query. Something, I might add, that most VB users already have to do in one way, shape or form already. In many ways, the additional semantic overhead for many users should be almost zero.

The only difference between now and VB 6.0 days is that in VB 6.0 days we didn’t go to the PDC and talk about how VB was going to support the new IFoo interface and the new IBar interface and extend ITypeInfo so that we could support this feature and change IDispatch so that we could support that feature. No, instead all those details were relegated (if discussed at all) to C++ sessions about new COM features and VB just didn’t bother to show up. Now, instead of saying “you’ve got to learn C++ if you want to understand how the nitty-gritty of this language works,” we’re just saying, “if you want to know, you’re all adults, here it is.” But, hey, if you don’t care about all the new whizzy language features — and, let’s face it, most non-language geeks shouldn’t — then don’t. LINQ will just work fine and your life only gets easier, not harder.

It’s that simple.

Everything old is new again…

3 Replies

In one of the comments to the “Introducing LINQ” entry that I wrote, Unilynx wrote:

Sounds like what we’ve been doing for five years already 🙂

This was a comment that came up several times at the PDC from various sources: “What’s so revolutionary about this stuff? We’ve been doing this kind of thing for years!” On the one hand, what’s unique about LINQ is how it’s built, it’s openness and flexibility, and it’s unification of data querying across domains. But on the other hand, yeah, let’s be honest: as Newton would say, if we’re seeing further, it’s only because we’re standing on the shoulders of giants. My standard response to this line of thought is: there are really only 15 good ideas in computer science and all of them were discovered thirty years ago or more. What happens is that the programming world just rediscovers them over and over and over again, each time prentending like the ideas are brand new.

Erik Meijer had a good comment in the languages panel that if you want to know what the next big thing in programming is going to be, all you have to do is look at what was hot twenty years ago. Because that tends to be the length of time it takes for the wheel to turn a full crank…

VB Rocks .NET Rocks Again