Lambda expressions, Part II: Closures

Well, I’m glad to see that even with my writer’s block, people still seem to be reading the blog! Although there is definitely a diversity of opinion, the majority of people seem to prefer the “Function” syntax of the choices I laid out, which is not exactly what we expected. (We were wagering people would go for the more cryptic, compact syntax. Shows what we know…) That’s the syntax you should expect to see in the beta, and if public opinion shifts over time in the beta, we’ll deal with that feedback if and when we come to it. After all, that’s what a beta is for…

The other major topic to talk about with lambda expressions is closures. I don’t have a particular question this time, just an update on a topic we discussed about nine months ago. When I talked about closures back in March, I raised the question of exactly how VB should treat local variables when lifting them into closures. I won’t rehash the entire discussion here (you can go back and read the original entry itself), but the problem boiled down to something like this:

Module Module1
    Delegate Function AddDelegate(ByVal x As Integer) As Integer

    Sub Main()
        Dim lambdas(9) As AddDelegate

        For i As Integer = 0 To 9
            Dim y As Integer = i
            lambdas(i) = Function(x) x + y
        Next

        For Each l As AddDelegate In lambdas
            Console.WriteLine(l(0))
        Next

        Console.ReadLine()
    End Sub
End Module

This is the same example as the previous entry on closures, except that instead of using queries, I’m using lambda expressions directly. What I’m doing here is filling an array with lambda expressions that are supposed to add a particular value to the parameter. So lambdas(0) will add 0 to the parameter, lambdas(1) will add 1 to the parameter, etc. At least, that’s the intent. But now we run into the closure question that I asked originally: should each iteration get its own copy of the local variable y, or should they all share the same copy of y? If the former, I get the intended semantics. If the latter, then every lambda adds 9 to the parameter (because they share the same y and the final value of y is 9).
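To make the "shared copy" outcome concrete, here is a rough sketch that spells the capture out by hand. The Closure class and the names SharedCaptureSketch and oneCopy are hypothetical stand-ins for illustration, not actual compiler output; the only assumption is that a captured local lives on an object that the lambdas read from when they are called.

```vb
Imports System

Module SharedCaptureSketch
    Delegate Function AddDelegate(ByVal x As Integer) As Integer

    ' Hypothetical stand-in for a closure object holding the captured y.
    Class Closure
        Public y As Integer

        Function Add(ByVal x As Integer) As Integer
            Return x + y
        End Function
    End Class

    Sub Main()
        Dim lambdas(9) As AddDelegate

        ' One closure object for the whole loop: every delegate shares it.
        Dim oneCopy As New Closure()

        For i As Integer = 0 To 9
            oneCopy.y = i
            lambdas(i) = AddressOf oneCopy.Add
        Next

        ' y is read at call time, so each call sees the final value, 9.
        For Each l As AddDelegate In lambdas
            Console.WriteLine(l(0))
        Next
    End Sub
End Module
```

Run as written, this sketch prints 9 ten times, which is the "every lambda adds 9" result described above.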

Just to make the problem clear, let’s look at an equally valid way (in VB) of writing the same code:

Module Module1
    Delegate Function AddDelegate(ByVal x As Integer) As Integer

    Sub Main()
        Dim lambdas(9) As AddDelegate

        For i As Integer = 0 To 9
            Dim y As Integer
            lambdas(i) = Function(x) x + y
            y += 1
        Next

        For Each l As AddDelegate In lambdas
            Console.WriteLine(l(0))
        Next

        Console.ReadLine()
    End Sub
End Module

Now instead of initializing y at the beginning of each iteration of the loop, I’m just incrementing it and letting the value carry over from one iteration of the loop to another. Now maybe we start to see the problem: if each iteration of the loop gets its own copy of y, then that seems to conflict with the idea that the value of the variable carries over from one iteration of the loop to another. We’re trying to eat our cake and have it too.

What we ended up deciding was to split the difference. It’s a little more complex than what you get in C#, for example, but it should give everyone the semantics they expect and not break any existing code (always a plus, in my book). Basically, we are going to treat the lifetime of a local variable as the scope that it is declared in. Thus, when you create a closure in a loop, each iteration of the loop does, indeed, get its own copy of the local variable. But to preserve the existing semantics, we’re going to add a caveat: when creating a new local variable, if a previous version of that local variable exists, we copy the value from that previous version into the newly created local variable. So the value of a local variable declared in a loop carries over from one iteration to the next. The result is that lambdas work the way most people expect, and existing code continues to run as expected.
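For the curious, the copy-forward rule can be approximated by hand. This is only a sketch under stated assumptions: the Closure class and the names CopyForwardSketch, current and previous are hypothetical, and the real compiler-generated code may look quite different. It mirrors the second example above, where y is declared without an initializer and incremented after the lambda is created.

```vb
Imports System

Module CopyForwardSketch
    Delegate Function AddDelegate(ByVal x As Integer) As Integer

    ' Hypothetical stand-in for a per-iteration closure object holding y.
    Class Closure
        Public y As Integer

        Function Add(ByVal x As Integer) As Integer
            Return x + y
        End Function
    End Class

    Sub Main()
        Dim lambdas(9) As AddDelegate
        Dim previous As Closure = Nothing

        For i As Integer = 0 To 9
            ' Each iteration gets its own copy of y...
            Dim current As New Closure()

            ' ...seeded from the previous copy, so the value carries over.
            If previous IsNot Nothing Then current.y = previous.y

            lambdas(i) = AddressOf current.Add
            current.y += 1
            previous = current
        Next

        For Each l As AddDelegate In lambdas
            Console.WriteLine(l(0))
        Next
    End Sub
End Module
```

This sketch prints 1 through 10: each lambda sees its own y, and each y starts from where the previous iteration left off.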

As with previous entries on closures, kudos to those who’ve bothered to read this far. It’s kind of arcane and, we believe, most people won’t ever have to think at all about the special semantics; things will just work.

8 thoughts on “Lambda expressions, Part II: Closures”

  1. Cameron Beccario

    I’m confused about exactly where the "y" in "Function(x) x + y" gets evaluated. Let’s consider a simpler example:

    Dim lambda1 As AddDelegate
    Dim y As Integer
    y = 0
    lambda1 = Function(x) x + y
    y = 1
    Console.WriteLine(lambda1(0))

    Would the output of this code be 0 or 1? I’m assuming from the discussion about closures and variable lifetimes that the output would be 1, by design. Is there any particular reason why the design is to evaluate "y" during lambda _execution_ rather than lambda _construction_? If "y" were evaluated during lambda construction, then doesn’t the whole closure lifetime problem go away?

  2. Cameron Beccario

    OK, on further thought, the design makes sense, especially if the lambda expression modifies "y" by passing it ByRef to some other function:

    Function Foo(ByRef a As Integer) As Integer
        a += 1
        Return a
    End Function

    y = 0
    lambda1 = Function(x) x + Foo(y)
    y = 1
    Console.WriteLine(lambda1(0))
    Console.WriteLine(y)

    This should print:

    2

    2

    Capturing the value of "y" in a closure during lambda construction probably defeats the whole purpose of lambda expressions. 🙂

    So, when are closures created? On each Dim or on each entry to a new scope? What would the following do:

    Dim lambdas(1) As AddDelegate

    label:
    Dim i As Integer
    lambdas(i) = Function(x) x + i
    i += 1
    If i <= 1 Then Goto label

    For Each l As AddDelegate In lambdas
        Console.WriteLine(l(0))
    Next

    What would the output be?:

    2

    2

    or:

    1

    2

  3. Justin Michel

    Module Module1
        Delegate Function AddDelegate(ByVal x As Integer) As Integer

        Sub Main()
            Dim lambdas(9) As AddDelegate
            Dim lambdas2(9) As AddDelegate
            Dim z As Integer

            For i As Integer = 0 To 9
                Dim y As Integer
                z += 2
                lambdas(i) = Function(x) x + y + z
                y += 1
                z += 2
                lambdas2(i) = Function(x) z += x + y
                y += 1
            Next

            For i As Integer = 0 To 9
                Dim func As AddDelegate = lambdas(i)
                Console.WriteLine(func(0))
                func = lambdas2(i)
                Console.WriteLine(func(0))
            Next

            Console.ReadLine()
        End Sub
    End Module

    * Why change the behavior of y in the loop above? I don’t see why it can’t behave exactly as VB8, and retain its value. This is different than other languages, but not any less "correct". If anything, it’s weird that other languages create a new instance of y just because the loop is starting over. (esp if your mental model for a loop is a goto at the end.) It’s as if each iteration of the loop in C-family languages is really a function call with its own stack.

    * However, I *would* expect each lambda expression to get its own copy of y, because each really *is* a new function call with its own stack.

    * It doesn’t even seem relevant to a variable within a loop, as z above should also get its own copy for each lambda expression.

    * I still prefer "lambdas(i) = ByVal(0) + y" syntax to what you’ve settled on. How do you handle ByVal and ByRef parameters? What about typed parameters? I would say

    "lambdas(i) = ByRef(0 As Integer) + y + z"

    Or maybe a more verbose

    "lambdas(i) = Param(0) + y + z"

    or

    "lambdas(i) = Param(ByRef 0 As Integer) + y + z"?

    * Is it legal for me to assign to z, as I do in the second lambda? It seems like it should be, but it would require you to essentially have 2 copies of z available in the lambda. So, I would expect lambdas(0)(0) = 0 + 0 + 2 = 2, and lambdas2(0)(0) = 4 + 0 + 1 = 5 with z = 5 as a result. In other words code that doesn’t assign to z would use a copy of the value of z when the expression was created, but assigning to z would still update the variable from the containing scope.

    * After reading Cameron’s comments, I’m even more confused about what should happen. I was going under the assumption that a closure should occur whenever the "Function(x) …" line is executed, giving the new function a snapshot of the variables in scope at that time, with the added complications of dealing with updates to those variables. So, in his 1st example, I would expect the "lambda1(0)" call to use the y=0 value, and then assign y = 0 + 1, which overwrites the existing value of y (which was already 1, so it’s a confusing example). So the output would be:

    1
    1

    But more importantly, changing the line "y = 1" in the code to "y = 42" would have no effect on the output, because y is overwritten by Foo().

    Does this really defeat the purpose of lambdas and/or closures? I don’t think so, but I’m no expert on either.

    For his second example, I would expect the output to be:

    0
    1

    The first lambda expression is a function that adds 0, and the second lambda is a function that adds 1.

    * I can see the value in some of these simple lambda expressions, but I’m afraid it’s a slippery slope to abuse of the ‘:’ and ‘_’ operators.

    lambdas(i) = Function(x) If z > 10 Then : z = x + y 2 : Else If z > 5 Then : _
    z = x + y : Else If z > 0 then : z = 1 : Else z = 0 : End If

    I think the value of VB has always been the simplicity and clarity of its syntax, helped in no small part by the line-oriented nature. Of course, not all VB syntax is simple and clear, but that’s a different topic.

  4. paulvick

    Cameron: Closures are created upon entry into a block. So your example would print 2 2, because you never left the block (so we never would create another copy of i).

    Justin: What we’re proposing to do is exactly what you suggest in terms of the behavior of closures and loops. The plan for more complex parameters is just to have the lambda syntax work like a nameless function:

    x = Function(x As Integer, y As Integer) x + y

    Note that for now lambda expressions can ONLY have expressions in them. So you can’t do assignment in the context of a lambda (at least, not directly). This is a limitation and we expect in the future to support lambdas with statements, but we won’t do this for Orcas.

  5. Branco Medeiros

    These lambda expressions remind me of the good old DefFn from MSX-Basic…! =))

    I suppose that the "correct" behavior would be to *have a reference* to y. On the other hand, making it possible to have a *copy* of the value would be useful too. Why not allow ByRef and ByVal in such expressions? You could even state that ByVal would be the default (just as in parameter declaration). This would follow the Rule of Least Surprise ™, I guess =)))

    <example>

    Delegate Function AddExpr(X As Integer) As Integer

    Dim Lambda(0 To 3) As AddExpr

    For Y As Integer = 0 To 3
        Lambda(Y) = Function(X) X + ByRef(Y) + Y
        'Or, if you wanted to be explicit:
        'Lambda(Y) = Function(X) X + ByRef(Y) + ByVal(Y)
    Next

    For Each F As AddExpr In Lambda
        Console.WriteLine(F(0))
    Next

    </example>

    Since, at invocation time, Y is 4 (because it went out of the loop with this value), the snippet would print 4, 5, 6 and 7.

  6. Fan Shi

    I think "Function" keyword is a little bit long for this.

    For example:

    Dim f As Func(Of Integer, Integer, Integer) = Function(x, y) x + y

    Could you consider a symbol or short keyword?

    Dim f As Func(Of Integer, Integer, Integer) = (x, y) x + y

    Dim f As Func(Of Integer, Integer, Integer) = |x, y| x + y

    Dim f As Func(Of Integer, Integer, Integer) = (x, y): x + y

  7. Andy

    Can you clarify "for now lambda expressions can ONLY have expressions in them. So you can’t do assignment in the context of a lambda (at least, not directly). This is a limitation and we expect in the future to support lambdas with statements, but we won’t do this for Orcas"?

    Does it mean that a lambda expression cannot have multiline expressions?

    Looking past Orcas, would it mean that lambda expressions with multiple statements would include ‘Return’ statements and ‘End Function’ keywords?

  8. Pingback: Panopticon Central
