Monthly Archives: October 2004

Movin’ out

You may notice that comments are temporarily turned off on this blog for a day or so. This is because I’m moving the blog out of my basement and onto a hosted ASP.NET service, easerve.com. As much fun as it’s been to run everything myself, the reality that I’m not cut out to be a full-time system administrator has slowly sunk into my consciousness over the past year. That, and my wife really wants a webmail solution for her domain and I really don’t want to spend the time tracking one down, setting it up and maintaining it. So off we go. Once the DNS updates have propagated, you’ll be able to post comments again. Here’s hoping the move is smooth!

Dynamic languages/dynamic environments

The .NET Languages blog recently pointed me to an SD Times article by Larry O’Brien entitled “Dynamic Do-Over.” Most of the later part of the article talked about IronPython and Jim Hugunin, but the earlier part touched on something that I’ve discussed earlier: the question of language strictness when it comes to typing. The more I think about it, the more I believe that static typing is a good thing and something that should be encouraged wherever possible. But when I say that, I don’t mean to say that there isn’t something of value in all those scripty-like dynamic languages out there. I think Larry hits the nail on the head in his article: what makes dynamic languages so great is not their loose type systems, but their dynamic environments.

In the end, I think anything that helps the average programmer be more productive is a good thing. By and large, static typing satisfies this dictum: static typing enables all kinds of programmer productivity features like Intellisense, better error messages at compile time, etc. (One could argue, I suppose, that you could lose the static typing and use type inferencing instead, but I wonder whether it would be possible to build a complete enough type inferencing ruleset that: a) was implementable, b) made some kind of sense, and c) could compete with just stating the damn type of your variables.) Dynamic environments also do this: edit and continue (pace Franz et al.), continuable exceptions, being able to call functions at design time, etc. So I think marrying the two worlds has some fascinating possibilities.

I should add, though, that I don’t believe loose typing has no use. One application for loose typing that I’m particularly interested in is modeling unstructured or semi-structured data such as XML. I think the work that the E4X group has been doing is particularly interesting…

Late bound overload resolution and structures

I was just reviewing some notes I made about updating the language specification and I came across this little gem: What do you think the following code does?

    Option Strict Off

    Structure MyStruct
        Public MyField As Integer

        Public Sub Mutate(ByVal x As Byte)
            MyField = x
        End Sub

        Public Sub Mutate(ByVal x As Short)
            MyField = x
        End Sub
    End Structure

    Module Module1

        Sub Main()
            Dim o As Object = 5S
            Dim s As MyStruct

            s.MyField = 10
            s.Mutate(o)
            MsgBox(s.MyField)
        End Sub

    End Module

Yes, that’s right, it prints 10. Why? Well, it has to do with how we handle overload resolution and loose-typing.

When we were originally coming up with the rules for overload resolution, we noticed a particular problem with loosely-typed code (that is, code that doesn’t explicitly state the types of its variables). Because all the variables in a loosely-typed program are typed as Object, there are a lot of places where the regular overload resolution rules fall down. Take the example above: if you follow the normal rules of overload resolution, the call to s.Mutate is ambiguous. That’s because the argument type is Object and the two parameter types are Byte and Short. Object has narrowing conversions to both Byte and Short, so we can’t actually choose between them.

However, we thought, at run-time all values become strongly typed. So even though the variable may be statically typed as Object at compile-time, if we deferred the overload resolution to run-time, we could probably resolve the overloading correctly. (I should point out again that this only applies to loosely-typed programs: in a strongly typed program, you’d just insert a cast to resolve the ambiguity.) So we added a special “loosely typed overload resolution rule” to handle this situation: if overload resolution produces an ambiguous result solely because of narrowing conversions from Object, then we defer the resolution until run-time. Or, in other words, we make the call late bound. This isn’t such a bad thing because if you’re doing loosely-typed programming, by definition you’re going to be doing a lot of late binding anyway.

OK, so all is well and good, no? Well, not exactly. You see, the problem in the above situation is that to make the call to Mutate late-bound, we have to call some helper functions at run-time. And those helper functions take the target of the call as a value typed as Object (since you can late-bind against any type). Which means that we have to cast the target of the call to Object. Which means that, since MyStruct is a structure, we have to box the value. Which means that the target of the call is no longer the stack location indicated by s, but instead some heap location where we boxed the value to. Which means that Mutate changes the heap location instead of the stack location, and so the change is lost when the method returns.
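The boxing step can be reproduced without involving the late binder at all. The sketch below is only a stand-in: it uses reflection (MethodInfo.Invoke) in place of the run-time helper functions, whose real implementation isn't shown here, and the module and function names are invented for illustration. The point is simply that converting a structure to Object produces a separate copy on the heap, and that copy is the one that gets mutated.

```vb
Option Strict On
Imports System
Imports System.Reflection

Structure MyStruct
    Public MyField As Integer

    Public Sub Mutate(ByVal x As Short)
        MyField = x
    End Sub
End Structure

Module BoxingSketch   ' hypothetical name, not from the original example

    ' Returns {field of the original structure, field of the boxed copy}.
    Function MutateThroughBox() As Integer()
        Dim s As MyStruct
        s.MyField = 10

        ' Converting to Object boxes the structure: the value is copied to the heap.
        Dim boxed As Object = s

        ' Stand-in for the late-binding helper (an assumption, not the real helper):
        ' invoke Mutate on the boxed copy rather than on s itself.
        Dim m As MethodInfo = GetType(MyStruct).GetMethod("Mutate", New Type() {GetType(Short)})
        m.Invoke(boxed, New Object() {5S})

        Return New Integer() {s.MyField, DirectCast(boxed, MyStruct).MyField}
    End Function

    Sub Main()
        Dim result As Integer() = MutateThroughBox()
        Console.WriteLine(result(0))   ' 10: the original s was never touched
        Console.WriteLine(result(1))   ' 5: only the boxed copy changed
    End Sub

End Module
```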

This is an unfortunate, subtle interaction between two features. The good news is that you will really only get into trouble when mixing strongly-typed with loosely-typed code. If, in the example above, you’d typed s as Object and assigned it a new instance of MyStruct, everything would have worked as expected because the structure would always have been boxed. On the other hand, if you turn Option Strict On, the above example will give you an error on the overloaded call to Mutate and will require you to resolve the ambiguity with a cast. So the chance of anyone running into this is thankfully remote.
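For what it's worth, here is roughly what the strongly-typed fix might look like. This is a minimal sketch, assuming a cast to Short is what you want; the module and function names are made up. With the cast in place the call is bound at compile time to the Short overload, no boxing of s is involved, and the mutation lands on s itself.

```vb
Option Strict On
Imports System

Structure MyStruct
    Public MyField As Integer

    Public Sub Mutate(ByVal x As Byte)
        MyField = x
    End Sub

    Public Sub Mutate(ByVal x As Short)
        MyField = x
    End Sub
End Structure

Module CastFixSketch   ' hypothetical name, not from the original example

    Function MutateWithCast() As Integer
        Dim o As Object = 5S
        Dim s As MyStruct

        s.MyField = 10

        ' The explicit conversion removes the ambiguity, so the compiler
        ' binds directly to Mutate(Short) and calls it on s in place.
        s.Mutate(CShort(o))

        Return s.MyField
    End Function

    Sub Main()
        Console.WriteLine(MutateWithCast())   ' 5
    End Sub

End Module
```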

Be careful what you wish for…

So we’d been bugging Scoble for a while about getting more VB interviews (besides Robert Green’s appearances) up on Channel 9 and about a month ago, he took me up on my offer to interview about VB. He stopped by with his camera and we did our little bit and then he left. Then yesterday I get an email with a bunch of pointers to videos of our interview for me to review and make sure I didn’t say something horribly stupid or anything.

Ug, be careful what you wish for. The experience of watching myself on video is just torture – do I really look like that? I mean, I’m sure it’s fine – my wife and friends seem to think I look and sound OK – but geez. I’ve finally taken to just listening to the audio instead of watching myself and that seems to be better (although still a bit difficult).

Am I just being neurotic? Or does everyone hate to watch themselves on camera? (I guess not everyone, judging by reality shows.)

Performance matters!

I was chatting with some of the people on our performance team (i.e. the people who are supposed to make sure VB is as speedy as possible), and they were talking about some plans they were considering to help gather public feedback on problem areas in the product. “What about the MSDN Product Feedback Center?” I asked. “We hadn’t thought of that,” they replied, “perhaps you should blog about it!” So consider it done. I am now officially encouraging everyone to report any major performance problems you’re having with VB using the MSDN Product Feedback Center.

To ensure that your bug gets the amount of attention due it, here are several suggestions:

  • Put the word “Performance:” in the title of the bug.
  • Include the general hardware specifications (CPU speed, memory, etc.) that you’re testing on.
  • Include any changes to the environment that you’ve made (e.g. what tool windows are showing, which profile you are using, etc.).

And, most importantly:

  • Include specific steps that will allow us to easily reproduce the problem.

The last point is important to ensure that we can actually do something about your bug report. Saying “the product is slow” isn’t going to help us. Saying “when I show the Autos window and then hit the Step Through toolbar button on this code, it takes several seconds” is more like it. Remember, the more specific you are, the better the chance we’ll be able to fix the problem.

I’m also talking to Robert Green to see if we can’t get a specific “Performance” Problem Type added to the Feedback Center to help categorization. The wheels grind slowly, though, so it may be a while on that one.

My First (and Last) Programming Language?

Somewhere along the line, I started reading Matt Jadud’s weblog because he talks about pedagogical issues with teaching computer science and I’m interested in this subject both from a personal and a professional standpoint. From the personal side, I spent several summers as a teaching assistant for Computer Science classes at the Talent Identification Program at Duke University and actually taught the introductory class one summer there. (TIP is a program that identifies bright teenagers using the SAT and gives them the chance to take advanced classes over the summer. It’s similar to CTY at Johns Hopkins.) Besides being fun, the teaching posed a fascinating problem to solve: how do you teach programming to non-programmers? Having picked up programming at an early age, I found it a challenge to work out how to impart something that, for me, was very natural. It also helped that I assisted some very good teachers. From the professional side of things, I’m interested in how people learn to program because VB is explicitly designed to be an approachable language for beginners. So how closely we do or don’t approach that ideal (and you can certainly argue that with me, if you like) is a big question for us.

A recent entry on Matt’s weblog referred back to an entry he wrote last year called “My First Programming Language(TM)”. Since I only subscribed in the past six months or so, I hadn’t seen it before but I’m glad he linked back to it because it had some very interesting thoughts about how programming languages should be structured for beginners. In particular, he talks about applying Joel Spolsky’s idea of leaky abstractions to programming languages by way of two generic languages named A and B:

In Language A, it turns out that you can’t say anything meaningful without reaching for other parts of the language. While we want to start off with simple bits of code, we must include bits and pieces of the full language to make even the simplest of sentences. Language B, however, is a complete core. We can do simple things simply to start, and if we want to say more complex things, we can. As we grow the language, we don’t have references out into parts of the language we don’t know; instead, we only have references back to the parts of the language we have already learned.

The inability to say simple things simply in Language A is a leaky abstraction. The inability to keep the complexity of the whole language under the covers kills us pedagogically (the blue arrow, below). What should we teach first? Where do we start to build a shared conceptual vocabulary that both the instructor and student can use? Where do we go next if we do manage to find a starting point?

What’s great about his discussion of these two languages is that it captures some of how we think about Visual Basic in a way that I don’t think we (or, at least, I) have been able to verbalize before. When designing new language features, we spend a lot of our time thinking about leaky abstractions and how to avoid them, without calling it exactly that. A good example of this is our design for partial types, which I’ve discussed here earlier. Partial types solve a particular problem for us: they allow us to abstract away automatically generated or templated code from user code. When designing the feature, we had to decide between two designs: one design required the abstraction to leak through into the user code (i.e. explicitly stating “Partial” on all parts of a partial type) and one design kept the abstraction separate at the cost of some explicitness (i.e. not having to specify “Partial” on all parts of a partial type). Each design had its advantages and disadvantages, but ultimately we came down on the side of the latter design because we felt it was more important to contain the abstraction, even at the sacrifice of some explicitness.

This does raise a deeper point which Matt references but doesn’t explicitly address. His entire discussion is organized around a split between a “Language A,” which is supposed to be leaky but suitable for “enterprise software,” and a “Language B,” which is supposed to not be leaky but suitable mainly for pedagogy (and not, by implication, enterprise software). The question is: why does there have to be this dichotomy? Why can’t there be a Language C that is designed to be both watertight in terms of leaky abstractions and suitable for writing “real” software? Indeed, our thesis is that Visual Basic does satisfy most of the traits of Language C, but we still end up spending a lot of time having the Language A/Language B argument anyway. Sometimes it’s like the old saying: “There are two kinds of people in the world, those who believe there are two kinds of people in the world and those who don’t.”