The Secret to Understanding C++ (and why we teach C++ the wrong way)

A little over a year ago, I asked “How on earth do normal people learn C++?” which reflected some of my frustration as I re-engaged with the language and tried to make sense of what “modern” C++ had become. Over time, I’ve found that as I’ve become more and more familiar with the language again, things have begun to make more sense. Then a couple of days ago I went to a talk by Bjarne Stroustrup (whose name, apparently, I have no hope of ever pronouncing correctly) and the secret of understanding C++ suddenly crystallized in my mind.

I have to say, I found the talk quite interesting, which was a huge accomplishment because: a) I usually don’t like sitting through talks under the best of circumstances, and b) he was basically covering the “whys and wherefores” of C++, which is something I’m already fairly familiar with. However, in listening to the designer of C++ talk about his language, I was struck by a realization: the secret to understanding C++ is to think like the machine, not like the programmer.

You see, the fundamental mistake that most teachers make when teaching C++ to human beings is to teach C++ like they teach other programming languages. But with the exception of C, most modern programming languages are designed around hiding as many details of how things actually happen on the machine as possible. They’re designed to allow humans to explain to the computer what they want to do in a way that’s as close to the way humans think (or, at least, how engineers think) as possible. And since C++ superficially looks like some of those languages, teachers just apply the same methodology. Hence, when you learn about classes, teachers tend to spend most of their time on the conceptual level, talking about inheritance and encapsulation and such.

But the way Bjarne talks about C++, it’s clear that everything C++ does is designed with one question firmly in mind: how will this translate to the machine level? This may be a completely obvious point for a language whose main purpose in life is systems programming, but I don’t think I’d ever really grokked how deeply that idea is baked into C++. And once I really looked at things that way, things made a lot more sense. Instead of teaching classes at just the conceptual level, you really need to teach classes at the implementation level for C++ to make sense.

For example, I’ve never been in a programming class that discussed how C++ classes are actually implemented at runtime using Vtables and such. Instead, I had to learn all that on my own by implementing a programming language on the Common Language Runtime. The CLR hides a lot of the nitty-gritty of implementing inheritance from the C# and VB programmer, but the language implementer has to understand it at a fairly deep level to make sure they handle cross-language interop correctly. As such, I find myself continually falling back on my CLR experience when looking at C++ features and thinking, “How is this supposed to work?” I can’t imagine how people who haven’t had to confront these kinds of implementation-level details figure it out.
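
For the curious, here’s roughly the mental model I mean: a hand-written sketch of what a virtual call lowers to. The class names are made up for illustration, and this is not any particular compiler’s actual layout (real implementations add RTTI slots, multiple-inheritance adjustments, and so on), but the shape is right.

```cpp
#include <cstdio>

// What the programmer writes:
struct Shape {
    virtual double Area() const { return 0.0; }
    virtual ~Shape() {}
};

struct Circle : Shape {
    double radius;
    explicit Circle(double r) : radius(r) {}
    double Area() const override { return 3.14159265 * radius * radius; }
};

// Roughly what the compiler generates behind the scenes: each object carries
// a hidden pointer to a per-class, shared table of function pointers.
struct VTable {
    double (*Area)(const void *self);
};

struct CircleRepr {
    const VTable *vptr;   // the hidden first field
    double radius;
};

double CircleArea(const void *self) {
    const CircleRepr *c = static_cast<const CircleRepr *>(self);
    return 3.14159265 * c->radius * c->radius;
}

const VTable circleVTable = { &CircleArea };

int main() {
    // The C++ version: the compiler does the vtable plumbing for us.
    Circle circle(2.0);
    const Shape &shape = circle;
    std::printf("via virtual call:    %f\n", shape.Area());

    // The hand-rolled version: the same dance, spelled out. A virtual call
    // is just: load the vptr, index into the table, call through the pointer.
    CircleRepr repr = { &circleVTable, 2.0 };
    std::printf("via explicit vtable: %f\n", repr.vptr->Area(&repr));
    return 0;
}
```

Once you’ve seen that, virtual dispatch stops being magic: it’s a pointer load, a table index, and an indirect call.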

It makes me wonder if a proper C++ programming course would actually work in the opposite direction from how most classes (that I’ve seen) do it. Instead of starting at the conceptual level, start at the machine level. Here is a machine: CPU, registers, memory. Here’s how basic C++ expressions map to them. Here’s how basic C++ structures map to them. Here’s how you use those to build C++ classes and inheritance. And so on. By the time you got to move semantics vs. copy semantics, people might actually understand what you’re talking about.
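
To make that concrete, the very first “structures” lesson might be as simple as this (the struct is invented for illustration; the sizes and offsets are implementation-defined, and the ones in the comments assume a typical layout with 4-byte ints):

```cpp
#include <cstddef>   // offsetof
#include <cstdio>

// A structure is just a recipe for laying out bytes: fields go in declaration
// order, with padding inserted to satisfy alignment.
struct Point {
    char tag;   // offset 0
                // (then, typically, 3 bytes of padding so 'x' can be aligned)
    int x;      // typically offset 4
    int y;      // typically offset 8
};

int main() {
    std::printf("sizeof(Point) = %zu\n", sizeof(Point));
    std::printf("x at offset %zu, y at offset %zu\n",
                offsetof(Point, x), offsetof(Point, y));

    // An expression like "p.y = p.x + 1" is just: load the word at
    // base+offsetof(x) into a register, add 1, store it at base+offsetof(y).
    Point p = { 'a', 1, 0 };
    p.y = p.x + 1;
    std::printf("p.y = %d\n", p.y);
    return 0;
}
```

From there, classes are just structs with rules attached, and the vtable sketch above is the natural next step.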

Trust, Users, Developer Division, and the Missing Link

Mary Jo Foley pointed today to a very interesting post by David Sobeski entitled Trust, Users and the Developer Division. Written from the perspective of a guy who was in the Windows division through the climactic shift to .NET, he brings up a lot of really good criticisms about exactly what went on during that time. (In fact, I was just wondering last night whether it’s time yet to sit down and write the “what I learned in the VB to VB.NET transition” blog post that I’ve been putting off for a long time.) But there was one really crucial point that he missed that I think made one of his central theses–that DevDiv managers went all Colonel Kurtz on developers and decided with .NET they were going to go build their own platform–a little on the unfair side. Somehow he, and most of the comments I’ve seen so far, have managed to overlook the monster that was striking fear into the hearts of the Microsoft management chain back then. The monster that was going to come along and destroy everything they’d worked so hard to build. Can no one remember? Has it really been so long since Java was relevant?

Yes, before Android and iOS and so on there was Java, and at the time it really scared the hell out of a lot of important and powerful people at Microsoft. Here was a programming layer that was supposedly going to: a) be free, b) be cross-platform, c) not be controlled by Microsoft, and d) abstract away the underlying operating system. I mean, now we all can look back and see exactly how the dream of “write once, run anywhere” worked out (pace HTML5), but at the time Java looked like the Trojan Horse that was going to slip into Windows via the browser and hollow out the Windows ecosystem from the inside. In just a few years Scott McNealy was going to be standing over the corpse of Windows laughing in Bill Gates’ face. I was just a lowly functionary on the VB team at the time (OLE Automation, FTW!) but since I had some fingers in the VB runtime pie I got included on some email chains from some very high-level people (not just in DevDiv!) talking about just how worried they were about what Java might do to Windows and how it might allow Sun to wrest control of the app market away from Microsoft.

(One interesting angle, too, about this was that since the Developer Division had a Java compiler project, they also got to see firsthand just how much better development could be on a modern runtime as opposed to the creaky, old, inconsistent Win32/COM API set. The VB team lost a bunch of very bright developers who jumped at the chance to work on something as enjoyable and freeing as Java as opposed to the jumbled mess Windows programming had become. I think this put extra wind in the sails of the managers who weren’t in denial about the fact that as much as developers were tied to the Windows ecosystem, they could still be pried away if you gave them a much better way to program.)

So, anyway, this is the big piece of missing context when looking at the motivation behind .NET. The .NET initiative was, fundamentally, a way to answer the threat that Java posed to Windows, not just some way for a bunch of DevDiv middle managers to go play OS designer (although they did seem to enjoy doing that just a little too much at times). And it’s not surprising that this context would be missing from the worldview of someone who worked in Windows. I never got the impression that Java scared the Windows guys half as much as it scared the DevDiv guys, because at that time Java was still mostly a language and a language runtime. It was something that we understood in a much more visceral way than they did. One can certainly argue that the Windows guys were right not to worry about Java, but hindsight is 20/20. And, based on what’s happened since then, maybe the Windows guys should have been a bit more worried about someone coming along and stealing away their position at the top of the app development heap. Just sayin’.

All that being said, I want to emphasize that I agree with much of the criticism that he levels at Microsoft’s and DevDiv’s approach to the question of trust. But I think that’s another post for another time.

Codename Nostalgia, or Go Speed Racer Go!

One of the great things to do in my neighborhood is to head over to Central Cinema, a small movie theater that serves food and drinks, and which has decidedly eclectic tastes. The best thing they do, speaking as a parent, is their “Cartoon Happy Hour” every Thursday where they show a wide range of cartoons for kids (and parents) to enjoy. It’s been a great opportunity to introduce my kids to cartoons that I watched as a kid (Hong Kong Phooey! Tom and Jerry! Looney Tunes!) that they might not otherwise get to see.

Recently they’ve shown several episodes of Speed Racer, a cartoon that my mom would never let me watch because it was “too violent.” (Compared to modern cartoons, it’s positively milquetoast.) So it was fun to finally watch it with the kids, but when the theme song came on it triggered nostalgia of an entirely different sort.

It reminded me of that day, long ago, when I actually got to pick the codename for a product. Looking back now, I’m not exactly sure why I was so keen to pick a codename–it’s not like it’s prestigious or anything–but I definitely was. I wanted to set the name for a whole product cycle, really wanted to. And when the cycle for Access ’97 came along, I got my chance, for two reasons. The first was that most of the team had split off to do a re-write of Access that was supposed to ship after Access ’95 but then got pushed back to after Access ’97 (and then cancelled), so pretty much all of the senior people were no longer on my team. And because Access ’97 was supposed to be a placeholder until the rewrite was finished, it was decided that the bulk of the work on the release was going to be solving the major performance issues we incurred moving from 16-bit Windows 3.1 to 32-bit Windows 95/NT.

Since I was heading up the performance work, I saw my chance and pushed to pick the codename. Of course, picking a codename wasn’t that easy–what would be a cool name that went with the idea of speed? That’s when Speed Racer flashed in my mind, and so the codename for the release became “Mach5,” named after Speed Racer’s car. In the pre-YouTube days, I even got a VHS tape of a Speed Racer episode and went over to Microsoft Studios to get them to convert the theme song to an AVI so I could send it around to the team. (Boy, was I a bit of a geek.) Mission accomplished.

Now, of course, you could never pick a codename like that. In our litigious society, people sue over codenames. I actually saw this in action–there was an old internal project at Microsoft that borrowed its name from a well-known company that makes little plastic bricks. Even though the tool was completely internal and never used or discussed publicly, the codename somehow ended up on some stray public document. The company in question was alerted to the use of the name, the lawyers got involved (so I heard), and in short order the tool was renamed to an innocuous acronym.

So… place names and colors it is, then!

Writing a Compiler in Haskell

In the useless memories department, xkcd’s cartoon today took me back.

In my junior year of college, I took the standard programming languages course that goes over fun stuff like how programming languages are put together. The main project for the class, of course, is to build a compiler for a small language. The twist was that the professor for this class happened to be one of the designers of the Haskell language, so you can guess what programming language the compilers had to be written in.

The thing is, this class took place probably in the fall of 1990 or the spring of 1991. According to Wikipedia, Haskell debuted in 1990, so first and foremost the tools we were working with were… uh, primitive at best. I think the Haskell interpreter was written in Common LISP, and basically you took your program, invoked the interpreter on it, went and got yourself a nice cup of tea (so to speak), and then came back to an answer (if you were lucky), an undecipherable error message (if you were only kind of unlucky), or just nothing (most of the time). Definitely honed the skill of “psychic debugging.”

Anyway, with a brand-new language (that, of course, had no manuals or books written about it yet) and an alpha-level (at best) implementation, as I remember it, at least half the class never even managed to get something working. I somehow managed to grok enough of Haskell to be able to write a functioning compiler that took our toy language in and produced correct output. I was one of the lucky ones. But here’s the thing…

It was, without a doubt, one of the most beautiful programs that I’ve ever written. Just a real work of art, with the data flowing through the code in one of the most natural ways I’ve ever seen. Just awesome. It’s the one piece of code that I look back on and wish that I still had a copy of.

So I’ve always had a soft spot in my heart for Haskell. Even if nobody else could actually understand what it was doing or why.

JsRT Sample Bug Fixed

This happened a while ago, but I’m just getting to it now because of my blog hiccup. A sharp-eyed reader pointed out a problem with my C# and VB Chakra hosting samples (on MSDN and on GitHub). When passing a host callback from managed code, I forgot to hold on to a managed reference to the delegate I was passing out to Chakra. So if the CLR ran a GC, it would collect the delegate, and then a callback on that delegate would jump into hyperspace. The samples are now correct. Ah, the joys of coordinating GCs.

Sorry for the blog hiccup…

If anyone has been trying to leave comments or send me a message or anything, I’m afraid my blog has been on the fritz for… a while. Not sure exactly how long. Apparently my hosting tier with Azure gives me a 20MB limit on my MySQL database for this blog, and when you hit it, it just starts failing your INSERT and UPDATE queries. So it wasn’t actually clear anything was wrong for quite a while, and then it wasn’t clear WHAT was wrong. Thank goodness for the wisdom of the Internet, that’s all I have to say. Anyway, should be back up and running now, thanks.

Memory profiling in Visual Studio 2013 with JsRT

One of the cool features of Visual Studio (starting in Visual Studio 2012 Update 1) is its JavaScript memory profiler. The profiler can take snapshots of the JavaScript heap and then gives you a nice UI you can use to drill in to those snapshots and compare them as you like:

Comparing two snapshots

Drilling in to a snapshot

Wouldn’t it be great to be able to utilize this to analyze the JavaScript memory usage in an app that hosts Chakra using JsRT? Well, you can! I’ve uploaded a new sample (and checked it in to my GitHub repo) that allows a JsRT app to generate heap dumps in a way that Visual Studio 2013 will be able to read. It doesn’t look quite as beautiful as the examples above–I didn’t go through the trouble of generating a screenshot for the summary page–but it should be entirely functional.

The sample project builds a C++ DLL that exports five functions:

  • bool InitializeMemoryProfileWriter() initializes the overall profile writer and should be called when you load the DLL.
  • MemoryProfileHandle StartMemoryProfile() starts a specific profile (called a “diagnostic session” in Visual Studio).
  • bool WriteSnapshot(MemoryProfileHandle memoryProfileHandle, IActiveScriptProfilerHeapEnum *enumerator) takes a memory profile handle and a heap enumerator (which can be obtained from JsEnumerateHeap) and writes the actual memory snapshot. Note that you can do this multiple times before finishing the memory profile. Each snapshot is written separately into the diagnostic session.
  • bool EndMemoryProfile(MemoryProfileHandle memoryProfileHandle, const wchar_t *filename) finishes the memory profile and writes it out to a file. The file should have a .diagsession extension if you want it to be natively recognized by VS when you open it. After this call, the memory profile handle is now invalid.
  • void ReleaseMemoryProfileWriter() releases the overall profiler writer and should be called when you’re shutting down or done with all profiling.

One thing to note is that once you’ve called JsEnumerateHeap for a particular JsRT runtime, you won’t be able to do anything in that runtime until the IActiveScriptProfilerHeapEnum pointer has been released.
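
Putting it all together, here’s a rough sketch of how a host might string these exports together. The function names come from the sample as described above and JsEnumerateHeap is the JsRT API, but the header name is my invention and error handling is elided, so treat this as illustrative rather than definitive:

```cpp
#include <activprof.h>   // IActiveScriptProfilerHeapEnum
#include <jsrt.h>        // JsEnumerateHeap and friends
#include "MemoryProfileWriter.h"  // hypothetical header declaring the sample DLL's exports

// Illustrative only: assumes a JsRT runtime and context are already created
// and set current, and that some script has run. Return codes are ignored.
void DumpHeapSnapshots(const wchar_t *filename)
{
    InitializeMemoryProfileWriter();                    // once, at DLL load

    MemoryProfileHandle profile = StartMemoryProfile(); // one diagnostic session

    // Take a snapshot. The runtime is frozen while the enumerator is alive,
    // so release it as soon as the snapshot has been written.
    IActiveScriptProfilerHeapEnum *heapEnum = nullptr;
    JsEnumerateHeap(&heapEnum);
    WriteSnapshot(profile, heapEnum);
    heapEnum->Release();

    // ...run more script, then take a second snapshot to compare against...
    JsEnumerateHeap(&heapEnum);
    WriteSnapshot(profile, heapEnum);
    heapEnum->Release();

    // Write the file; use a .diagsession extension so VS opens it natively.
    EndMemoryProfile(profile, filename);                // handle is invalid after this

    ReleaseMemoryProfileWriter();                       // once, at shutdown
}
```

A host would then just call something like DumpHeapSnapshots(L"app.diagsession") and open the resulting file in Visual Studio 2013.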

We’ve already been using this code internally to look at the memory usage of some internal JsRT hosts, and it’s been very helpful! Let me know if there are problems or issues, and feel free to grab me on Twitter if you have any questions.

Thinking of rewriting your codebase? Don’t.

A friendly word of advice: if you’re thinking of rewriting your codebase… don’t. Just don’t. Please. Really.

Yes, I know, your codebase as it exists today is a steaming pile of crap. Yes, I know, you had no idea what you were doing when you first wrote it. Or that it was written by idiots who have long since moved on. Or that the world has changed and whatever style of development was in vogue at the time is now antiquated. Yes, I know that you’re tired of dealing with the same old architectural limitations over and over again. And that the programming language you wrote it in is no longer fashionable. And that there are lots of new features in the newer versions of the language that would allow you to express things so much more elegantly. Or maybe, god help me, you think that you can write it more modularly this time and that will allow you to more quickly iterate on new features.

Whatever you think, let me make a bold pronouncement: YOU ARE WRONG. WRONG, WRONG, WRONG. FOR THE LOVE OF GOD STOP WHAT YOU ARE DOING RIGHT NOW AND PUT DOWN THE KEYBOARD BEFORE YOU HURT ANYONE.

OK, now that that’s out of my system, let’s get to the natural question: why? Why am I saying this? Because if you ignore my advice and plunge ahead anyway, you’re going to run into what I modestly call:

Vick’s Law of Rewrites: The cost of rewriting any substantial codebase while preserving compatibility is proportional to the time it took to write that codebase in the first place.

In other words, if it took you one year to write your codebase, it’s going to take you on the order of one year to rewrite that codebase. But, of course, we’re not talking about codebases that are only around one year old, are we? No, we aren’t. Instead, people usually start talking about rewrites around the 5-7 year mark. Which means that, as per the law above, it’s going to take on the order of 5-7 years to rewrite that codebase and preserve any semblance of compatibility. Which is not what most people who embark on large rewrites usually think. They think that they can do it in substantially less time than it took in the first place. And they’re always wrong, in my experience.

I first came to this law way back when I worked on Access and the leaders of the project (who, I realize now, were still laughably young, but seemed very old and wise to me at the time) started talking about rewriting Access after we shipped version 2.0 (development time: approx. 4 years total at that point). At the time, I owned the grid control that was the underpinning of the Access data sheet and other controls, and let me tell you–that piece of code was a bit of a beast. Lots of corner cases to make the UI work “just so.” I was talking to one of the leads about this and he dismissed me with a proverbial wave of the hand: “Oh, no, since we understand how that code works now, we can rewrite it in three months.” I think it was at that moment I knew that the rewrite was doomed, although it took three more years for the team to get to that realization.

The bottom line is that the cost of most substantial codebases comes not from the big thinking stuff, but from details, details, details and from bugs, bugs, bugs. And every time you write new code–even if you totally understand what it’s supposed to be doing–you’ve still got the details and bugs to deal with. And, of course, it’s extremely unlikely that you do totally understand what it’s supposed to be doing. One of the things that continually humbled me when I worked on Visual Basic was how we’d be discussing some finer point of, say, overload resolution, and someone would point out some rule that I didn’t remember. No way, I’d say, that’s not a rule. Then I’d go check the language specification which I wrote, and there it would be, in black and white. I had totally forgotten it even though I had written it down myself. And the truth is, there are a million things like this that you will inevitably miss when you rewrite, and then you will have to spend time fixing up. And don’t even get me started on the bugs. Plus, you’re probably rewriting on a new platform (because who wants to write on that old, antiquated platform you were writing on before?), and now you’ve got to relearn all the tricks you knew about how to work with the old platform but which don’t work on the new one.

In my experience, there are three ways around this rule:

  1. You’re working on a small or very young codebase. OK, fine. In that case it’s perfectly OK to rewrite. But that’s not really what I’m talking about here.
  2. Decide that you don’t care so much about preserving compatibility or existing functionality. Downside: Don’t expect your users to be particularly pleased, to put it mildly (see: Visual Basic .NET).
  3. Adopt a refactor rather than rewrite strategy. That is, instead of rewriting the whole codebase at once, take staged steps towards the long-term goal, either rewriting one component at a time, or refactoring one aspect of the codebase at a time. Then stabilizing, fixing problems, etc. Downside: Not as sexy or satisfying as chucking the whole thing out and “starting clean.” Requires actually acquiring a full understanding of your existing codebase (which you probably don’t have). Plus, this will take a lot longer than even rewriting, but at least you always have a compatible, working product at every stage of the game.

Rewriting can do enormous damage to a product, because you typically end up freezing the development of your product until the rewrite is done. Maybe at best you leave behind a skeleton crew to pump out a few minor features while the bulk of the team (esp. your star coders) works on the fun stuff. Regardless, this means your product stagnates while the rest of the world moves on. And if, god forbid, the rewrite fails, then you run the risk of your team moving on and leaving the skeleton crew as the development team. I’ve seen this happen to products that, arguably, never really recovered from it.

So please, learn to live with that crappy codebase you’ve got and tend to it wisely. The product you save may be your own.
