Category Archives: Language Design

How on earth do normal people learn C++?

I’m asking this seriously. How?

One dirty little secret around Microsoft is how little “real” C++ code is written around here. I think a lot of it has to do with the fact that when C++ was first coming into it’s own there were a number of high profile projects that enthusiastically adopted C++ but ran into a lot of problems. Some of those problems had to do with the immaturity of the tools (produced by Microsoft), some of them had to do with a lack of experience with C++ (since, of course, it was relatively new at the time), and some of them had to do with the “when you have a hammer…” effect. The end result is that although a lot of projects I’ve worked on have had “.cpp” file extensions on their source code, in many cases you could rename the extension to “.c” and it would compile with very few changes.

Lately, though, I’ve been having to do a lot more modern-ish C++ programming, and while I have an amazed respect for the sheer amount of power available to the C++ programmer nowadays, I am also baffled how anyone can actually understand what the hell they are doing half the time. Don’t get me wrong, I feel like I get by pretty well in C++, but then I spent a decade designing a OOP language, building a compiler for it, and debugging the whole thing. There are lots of times when I’ve been able to survive in C++ only because I can fall back on a mental type system model that was built up through a lot of blood, sweat, and tears. I wonder how people who haven’t had that particular experience actually grok a lot (sometimes, any) of what C++ does.

I’m guessing people just throw a lot of code at the wall until something sticks, or maybe just copy and paste from Stack Overflow. I don’t know. But, seriously, sometimes I think you should have a license before you should be allowed to attempt C++.

Either that, or I’m just being dense. Sadly, that’s always a possibility.

The Use/Build Fallacy

Working in the language space, especially in language design, you frequently encounter people who fall victim to what I call the “Use/Build Fallacy.” It goes something like this:

Because I know how to use something, I know how to build it as well.

This fallacy is best illustrated by a story I heard from a friend who’s a teacher (another profession that frequently has to deal with this). She was teaching middle-school when teacher conferences rolled around. Talking to the father of one of her students, she explained to him that his daughter was having a lot of trouble in English class and that, based on her observations of how hard the daughter was working, she was pretty sure that the daughter had some sort of language learning disability. She therefore strongly recommended that he take his daughter to an expert to get tested, and that she be tutored by someone trained to deal with the specific kind of learning disability. The father was nonplussed, mainly because he didn’t like the idea that all this would cost him money. “Can’t you just help her more in class?” he asked. My friend explained that she was helping her all she could, but she wasn’t an expert in diagnosing learning disabilities and his daughter really needed to see someone who had the appropriate training.

After a bit of back-and-forth, the father finally got exasperated and said, “Fine, I’ll just tutor her myself! I mean, how hard could it be? I went to school!” My friend then shot back, “Look, you’re a general contractor, right? What would you think if I came to you and said, ‘I don’t need you to build my house—I’ve lived in a house before, so how hard could it be to build one myself?” This, finally, stumped him. I’m not sure whether he actually got his daughter the help she needed, but the story stuck with me because my friend’s response is the perfect distillation of the Use/Build Fallacy.

Note that I’m not saying that just because you’re not an expert on something you can’t have an opinion. I may not know how to build a house, but that doesn’t mean I have nothing to say to the contractors if I decide to do some renovations on my house. Not falling prey to the fallacy, though, means that I always keep a healthy respect for the expert in a field—as long as they truly seem to know what they’re talking about. (I hear this from friends who are architects all the time—they get hired by someone to build or renovate a house for them, and then their client spends all their time endlessly arguing with everything they do. Why bother to hire an expert if you think you already know how to do it yourself?)

I try to remember this myself every time I encounter some aspect of some programming language that I don’t like. Right now, I’m neck-deep in C++ code and it’s tempting to spend all my time kvetching about how how horrible a job Bjarne has done over the years. And then I try to remember—even as someone who’s actually built a language—that this stuff is hard. A language of any complexity has a huge number of moving parts, all of which interact with each other in an unpredictable manner. Historical choices can come back to bite you in all sorts of unexpected ways. Oftentimes all you have are a bunch of imperfect choices, and you have to simply pick the least bad of them all. And then you get to sit there and listen to everyone on the sidelines complain about how horrible a job you’ve done and how they could do it so much better than you because, hey, they’ve used a programming language before.

So I try to temper my complaints with a little humility, and remember how much different building is from using.

Why every language needs a language specification…

One of the things that I discovered when I started working on SQL Server is that T-SQL, like VB prior to .NET, has no language specification. This continues to mystify me—how language teams get away without having a language specification for so long. I should probably back up for a moment and explain what I mean when I say “language specification.” I mean a document that:

  • Is public.
  • Is kept up to date.
  • Describes the entire language (syntax, semantics, type system, etc.).
  • Is produced by the team that produces the language itself.

An initial spec that was allowed to lapse doesn’t count, for obvious reasons. Books written by people outside of the team/company don’t count because there’s no way someone who doesn’t live and breath the language every single day can possibly hope to shoot for completeness. Documents that just describe syntax or sort-of describe the language (SQL Server’s Books Online falls into this category) don’t count because often what you need is a holistic description of the language, and for that you need, well, the whole language described. And documents that aren’t public don’t count because what good is a document that no one reads? (Not that many people read language specifications anyway, unless the author’s name happens to be Hejlsberg or Gosling.)

I think the reason that language specifications often don’t get written is that they are a huge amount of work and an incredible pain in the ass. Writing the Visual Basic Language Specification was no mean feat and consumed a whole lot of time that I could have productively spent elsewhere. But I do think that every programming language needs one, for a variety of reasons:

  1. Languages are not algorithms. Although many programs of some sort or another could use a good specifications written down, the reality is that many don’t need one because the code is the specification. For example, I can imagine that there might be some specification written down about how the Excel recalc engine works, but it’s probably not that useful because, well, you could just go look at that code itself if you want to know how the recalc engine works. Most pieces of code can be more easily understood by tracing through the code than by reading a description of them. Languages, however, are a special case—they are an emergent property of the compiler, and although there are various algorithms that contribute to a language (say, the binding rules), much of the most important parts of the language arise from the interplay of the various algorithms. Thus, just having the code often is not really enough to understand what a language is or does.
  2. Specifications keep you honest. And honesty is important when it comes to programming languages. You can cheat like hell when it comes to user interfaces, and most of the time you can get away with it—extraneous menu items or options or notifications or other junk may clutter the UI up a bit but by and large you can deal with it, and you can always “clean up” a UI without too much trouble. You can cheat less with libraries and APIs, but even there you can always come up with a new version of the interface or library and gradually move people. With a language, however, it’s nearly impossible to fix something once it’s in the language. SQL Server’s own deprecation process takes three major releases, which means that a mistake you make in the language will take well over a decade to get rectified, if at all. Having a central language specification helps the team monitor what’s really going in the language and helps keep the team honest about what they’re doing.
  3. Having to explain things forces you to actually try and understand what your language does. I was continually amazed when writing the Visual Basic language specification how superficially I actually understood some features until I tried to write them down and explain them. Features that had been extensively discussed in design meetings and were already prototyped, even. In this way, it’s somewhat analogous to coding—how often I think I understand the solution to a problem until I sit down to write the code and realize how foolish I have been to think I really understood the problem!
  4. You need some kind of institutional memory. It’s all well and good to rely on the one guy who knows everything about everything in the language, but what happens when they retire/quit/move on/get hit by a bus? Now all you’re left with is a mass of language rules and often no idea why things were done one particular way or another. This is not an uncommon problem in general with programming, but it’s even more acute with language design. When I was a young, naïve language designer, there were a number of times when I looked at existing languages and the choices they made and thought to myself, “Boy, is that a dumb design decision.” Then, a couple of years down the road, having ignored some of the wisdom of those who came before me and having run into a brick wall at full speed, I would think, “Oh, I see, that’s why they did that…”

Anyway, I’m not planning on just complaining about this and not doing anything about it. Stay tuned…

You should also follow me on Twitter here.