So my wife’s been working out with a personal trainer at the health club we belong to through work. Yesterday, we get a call from the trainer saying that she can’t meet with Andy because we cancelled Andy’s membership with the health club, which is odd because we hadn’t. So I called the health club this morning and they told me that they cancelled my wife’s membership because Microsoft told me that I was cancelling my membership! So I called the Microsoft benefits line and they informed me that they cancelled my health club membership because the HR system says that I’m resigning from Microsoft this Friday!
Is there some memo I didn’t get?
I’m sure that some wires got crossed somewhere between me and Paul Vijijlkj or whoever is quitting on Friday and it’ll all be straightened out in short order. But it’s just another Brazil-esque reminder of how technology can take on a life of it’s own. Glad I figured this out through the health club ahead of time and not when my next paycheck didn’t show up or my cardkey stopped working…
Roy points to Philip’s complaint that VB still exhibits problems with multi-language solutions that have been around since the VS 2002 beta. Philip’s completely correct, and the explanation of why this bug still hasn’t been fixed even though we’ve known about it since before VS 2002 shipped bears some explanation. Specifically, the problem is with a mistake we made when designing our background compilation system a very long time ago. Since I’ve been asked more than a few times about how background compilation works, this is an excellent chance to delve into that subject. So let me talk about background compilation for a while and then we’ll get back to Philip’s bug.
“Background compilation” is the feature in VB that gives you a complete set of errors as you type. People who move back and forth between VB and C# notice this, but VB-only developers may not realize that other languages such as C# don’t always give you 100% accurate Intellisense and don’t always give you all of the errors that exist in your code. This is because their Intellisense engines are separate, scaled-down compilers that don’t do full compilation in the background. VB, on the other hand, compiles your entire project from start to finish as Visual Studio sits idle, allowing us to immediately populate the task list with completely accurate errors and allowing us to give you completely accurate Intellisense. As Martha would say, it’s a good thing.
However, doing background compilation is a tricky prospect. The problem is that just as soon as you’ve finished compiling the project in the background, the user is likely to do something annoying like edit their code. Once they do that, the application you just finished compiling is now incorrect – it doesn’t reflect the current state of the user’s code anymore. So, the question is: how do you handle that? The brute force way would be to throw away the entire result of the compilation and start over again. However, since Intellisense depends on compilation being mostly complete, this is impractical – given a reasonably large project, you may never get the chance to give Intellisense because by the time you’re almost done recompiling the whole project, the user has had the chance to type in another line of code, thus invaliding all the work you just did. You’ll never catch up.
To deal with this, we implement a concept we call “partial decompilation.” When a user makes an edit, instead of throwing the entire compilation state away, we figure out the smallest amount of stuff we can throw away and then keep everything else. Since most edits don’t actually affect the project as a whole, this means we can usually throw out minimal information and get back to being fully compiled pretty quickly. Here’s how we do it: each file in the project is considered to be in one of the following states at any one time:
NoState: We’ve done nothing with the file.
Declared: We’ve built symbols for the declarations in the file, but we haven’t bound references to other types yet.
Bound: We’ve bound all references to types.
Compiled: We’ve emitted IL for all the properties and methods in the file.
When a project is compiled, all the files in the project are brought up to each successive state. (In other words, we have to have gotten all files to Declared before we can bring any file up to Bound, because we need to have symbols for all the declarations in hand before we can bind type references.) When all the files have reached Compiled, then the project is fully compiled.
Now let’s say that a user walks up to a project that’s reached Compiled state and makes an edit to a file. The first thing that we have to do is classify the kind of edit that the user made. (Keep in mind that “an edit” can actually be an extremely complex one if the user chose to cut and paste one block of code over another block of code.) Edits can generally be broken down into two classifications:
Method-level edits, i.e. edits that occurs within a method or a property accessor. These are the most common and also the easiest to deal with because a method-level edit can never affect anything outside of the method itself.
Declaration-level edits, i.e. edits that occur in the declaration of a type or type member (method, property, field, etc). These are less common and can affect anyone who references them or might reference them anywhere in the project.
When an edit comes through, it’s first classified. If it’s a method-level edit, then the file that the edit took place in is decompiled to Bound. This involves the relatively small work of throwing away all the IL for the properties and methods defined in the file. Then we can just recompile all the methods and we’re back to being fully compiled. Not a lot of work. Say, though, that the edit is a declaration-level edit. Now, we have to do some more work.
Earlier, when we were bringing files up to Bound state, we kept track of all the intra-file dependencies caused by the binding process. So if a file a.vb contained a reference to a class in b.vb, we recorded a dependency from a.vb to b.vb. When we go to decompile a file that’s had a declaration edit, we call into the dependency manager to determine what files depend on the edited file. We then decompile the edited file all the way down to NoState, because we have to rebuild symbols for the file. Then we go through and decompile all the files that depend on the edited file down to Declared, because those files now have to rebind all their name references in case something changed (for example, maybe the class the file depended on got removed). This is a bit more work, but in most cases the number of files being decompiled is limited and we’re still doing a lot less work than doing a full recompile.
This is kind of the high-level overview of how it works – there are lots of little details that I’ve glossed over, and the process is quite a bit more complex than this, but you get the idea. I’m going to stop here for the moment and pick up the thread again in a few days, because there’s a few more pieces of the puzzle that we have to put into place before we get to explaining the bug.
Tom rebuts Jeremy Mazner’s lament for the disappearance of easter eggs. Ultimately, I think most easter eggs are the equivalent of stories about college exploits: they’re only interesting only to the people who were involved and deathly boring to everyone else. Sure, there is the occasionally clever or humorous easter egg, but most serve no purpose to anybody except as a little ego trip.
I say this knowing full well that I wrote several easter eggs for Access before the prohibition on easter eggs went in. I even wrote one that I thought was somewhat clever: a Magic Eight Ball easter egg. The problem was, I left the team and within several versions the Magic Eight Ball had turned into the Crashing Magic Eight Ball.
I don’t think losing easter eggs is a great loss, personally… (Although those Excel guys were always prettydamnimpressive.)
Geek Noise pointed to a tirade against the Cult of Performance that brought to mind a criticism that Jet had of my 10 Rules of Performance entry. The points raised are very well taken – it can actually be more damaging to obsess about performance prematurely than to obsess about it too late. The point that I’m arguing for is moderation. For some reason, developers like to live at extremes – either they’re this way or that, never in the middle. Either they never think about performance or they are completely obsessed with it. Instead, I’m arguing that performance should be a part of the development process and the thought process, but not the only consideration. (If it was, most applications would never ship.)
I suppose this is all human nature, if you look at the way that people tend to polarize in other areas. The title for this entry comes from a speech Polonius gives his son Laertes in Hamlet in which he’s purporting to give him some life advice. Since Polonius is sort of a doddering old windbag, most of the speech boils down to useless platitudes along the lines of be smart, but not too smart; be nice, but not too nice; be rich, but not too rich. However, he does end with a good bit of advice that, nontheless, has got to be the most difficult to follow:
This above all: to thine own self be true, And it must follow, as the night the day, Thou canst not then be false to any man.
If you think about it, a lot of my performance advice boils down to a variation on this theme. Most of what I argue for is to stop and take the time to understand the true nature of your application and work with that. Obsessing about the performance of code you haven’t even written yet doesn’t fall into that category…
…nothing arrives for a long time and then a bunch of them show up together. (Apologies to Alan Moore for the corrupted quote.)
I was having a nice, leisurely week getting caught up but couldn’t think of anything inspiring to write about to save my life. Then I go and get very ill (ending up briefly in the ER to get rehydrated because it was after hours at the doctor’s offices, happy, happy, joy, joy) and I find lots of interesting things to blog about showing up in my inbox. Feh.
The fever seems to have broken (knock on wood) and some of the other fun aspects of whatever virus I’ve got seem to have passed, but I’m still only partially here. I’ll see if I can catch up some this weekend…
I could have written a comment on Raymond’s blog about his dis of Krispy Kreme, but then I thought I’d write a full fledged entry. He says:
I don’t understand the appeal of KK donuts. They have no flavor; it’s just sugar.
…and with that, Raymond proves that he did not grow up in the South. First, it is a maxim of Southern cooking that anything that tastes good with sugar will taste even better with lots more sugar. Case in point: iced tea. Frankly, I don’t know how the rest of the world manages to drink tea without a ton of sugar dumped in it. My mom’s recipe for iced tea? Make about a quart of tea and then dump in 3/4 a cup of sugar. Sure, what you get tastes more like sugar water than tea, but that’s the point. Southerners would pour raw granulated sugar down their throats if their stomachs could handle it… Which, now that I think about it, may explain Krispy Kremes…
Of course, the reality is that my opinions of Krispy Kremes, like Raymond’s opinions of Dunkin Donuts, have been bred in. Growing up in North Carolina, home of Krispy Kreme, I’ve been eating those suckers since I was a kid. Nothing says “home” like Krispy Kreme, which is why the KK stores opening up in the Pacific Northwest have induced a kind of cognitive dissonance every time I drive by them. (It’s also weird to see stores that are so new… I’m more used to stores that are so old my parents went to them as kids. I also think those older machines make better doughnuts, but I could be wrong on that.) Now if we could just get a damn Chick-Fil-A franchise opened up in Seattle, I’d never have to go home to see my parents again!
I will have to admit, though, that Seattlites were totally out of control when the first KK opened in Issaquah. There were, like, hour long lines at 4 in the morning. People would ask me, when they found out I was from a state that already had Krispy Kremes, “are they really that good?” My answer was always: “They’re really good but, hey, they’re still just doughnuts.”
(I will add that my parents are safe – I’ll continue to visit them until Bullock’s Bar B Que starts licensing franchises! Mmmmm…. hush puppies…)
I realize the last entry veers close to Chris Brumme territory in terms of its Brobdingnagian length, but it was something that I occasionally get requests for internally and thought people might find interesting. Definintely represents a previous chapter in my life…
An Updated Introduction: Seven Years Later (February 9, 2004)
I wrote this document a long time ago to capture the conclusions I had reached from working on application performance for a few years. At the time that I wrote it, the way in which Microsoft dealt with performance in its products was starting to undergo a large-scale transformation. Performance analysis up to that time had largely been an ad-hoc effort and was hampered by a lack of well-known “best practices” and, in some cases, the tools necessary to get the job done. As the transition started from the 16-bit world to the 32-bit world, this kind of approach was clearly insufficient to deal with the increasing popularity of our products and the increasing demands that were being placed on them. The past seven years have seen major changes in the way that performance is integrated into the development process at Microsoft, and many of the “rules” that I outline below have become internalized into the daily processes of most major groups in the company. Even though a lot of what follows is no longer fresh thinking, I still get requests for the document internally which leads me to believe that there’s still value in saying things that many people already know. So I’ve decided to provide in publicly here, for your edification and in the hope that someone might find it useful. (Historical note: The “major Microsoft application“ that I refer to below was not VB.)
A Short Introduction (May 28, 1997)
In the fall of 1995, I was part of a team working on an upgrade to a major Microsoft application when we noticed that we had a little problem — we were slower than the previous version. Not just a little slower, but shockingly slower. This fact had largely been ignored by the development team (including myself), since we assumed it would improve once we were finished coding up our new features. However, after a while the management team started to get a little worried. E-mails were sent around asking whether the developers could spend some time focusing on performance. Thinking “How hard could this be?” I replied, saying that I’d be happy to own performance. This should be easy, I thought naively, just tweak a few loops, rewrite some brain-dead code, eliminate some unnecessary work and we’ll be back in the black. A full two years (and a lot of man-years worth of work on many people’s part) later, we finally achieved the performance improvements that I was sure would only take me alone a few months to achieve.
Unsurprisingly, I had approached the problem of performance with a lot of assumptions about what I would find and what I would need to do. Most of those assumptions turned out to be dead wrong or, at best, wildly inaccurate. It took many painful months of work to learn how to “really” approach the problem of performance, and that was just the beginning — once I figured out what to do, I discovered that there was a huge amount of work ahead! From that painful experience, I have come up with a set of “Ten Rules of Performance,” intended to help others avoid the errors that I made. So, without further ado…
Rule #1: Don’t assume you know anything.
In the immortal words of Pogo, “We have met the enemy, and he is us.” Your biggest enemy in dealing with performance is by far all the little assumptions about your application you carry around inside your head. Because you designed the code, because you’ve worked on the operating system for years, because you did well in your college CS classes, you’re tempted to believe that you understand how your application works. Well, you don’t. You understand how it’s supposed to work. Unfortunately, performance work deals with how things actually work, which in many cases is completely different. Bugs, design shortcuts and unforeseen cases can all cause computer systems to behave (and execute code) in unexpected, surprising ways. If you want to get anywhere with performance, you must continuously test and re-test all assumptions you have: about the system, about your components, about your code. If you’re content to assume you know what’s going on and never bother to prove you know what’s going on, start getting used to saying the following phrase: “I don’t know what’s wrong… It’s supposed to be fast!”
Rule #2: Never take your eyes off the ball
For most developers, performance exists as an abstract problem at the very beginning of the development cycle and a concrete problem at the very end of the cycle. In between, they’ve got better things to be doing. As a result, typically developers write code in a way that they assume will be fast (breaking rule #1) and then end up scrambling like crazy when the beta feedback comes back that the product is too slow. Of course, by the time the product is in beta there’s no real time to go back and redesign things without slipping, so the best that can be done is usually some simple band-aiding of the problem and praying that everyone is buying faster machines this Christmas.
If you’re serious about performance, you must start thinking about it when you begin designing your code and can only stop thinking about it when the final golden bits have been sent to manufacturing. In between, you must never, ever stop testing, analyzing and working on the performance of your code. Slowness is insidious — it will sneak into your product while you’re not looking. The price of speed is eternal vigilance.
Rule #3: Be afraid of the dark
Part of the reason why development teams find it so easy to ignore performance problems in favor of working on new features (or on bugs) is that rarely, if ever, is there anything to make them sit up and pay attention. Developers notice when the schedule shows them falling behind, and they start to panic when their bug list begins to grow too long. But most teams never have any kind of performance benchmark that can show developers how slow (or fast) things actually are. Instead, most teams thrash around in the dark, randomly addressing performance in an extremely ad hoc way and failing to motivate their developers to do anything about the problems that exist.
One of the most critical elements of a successful performance strategy is a set of reproducible real-world benchmarks run over a long period of time. If the benchmarks are not reproducible or real-world, they are liable to be dismissed by everyone as insignificant. And they must be run over a long period of time (and against previous versions) to give a real level of comparison. Most importantly, they must be run on a typical user’s machine. Usually, coming up with such numbers will be an eye opening experience for you and others on your team. “What do you mean that my feature has slowed down 146% since the previous version?!?” It’s a great motivator and will tell you what you really need to be working on.
Rule #4: Assume things will always get worse
The typical state of affairs in a development team is that the developers are always behind the eight ball. There’s another milestone coming up that you have to get those twenty features done for, and then once that milestone is done there’s another one right around the corner. What gets lost in this rush is the incentive for you to take some time as you go along to make sure that non-critical performance problems are fixed. At the end of milestone 1, your benchmarks may say that your feature is 15% slower but you’ve got a lot of work to do and, hey, it’s only milestone 1! At the end of milestone 2, the benchmarks now tell you your feature is 30% slower, but you’re pushing for an alpha release and you just don’t have time to worry about it. At the end of milestone 3, you’re code complete, pushing for beta and the benchmarks say that your feature is now 90% slower and program management is beginning to freak out. Under pressure, you finally profile the feature and discover the design problems that you started out with back in milestone 1. Only now with beta just weeks away and then a push to RTM, there’s no way you can go back and redesign things from the ground up! Avoid this mistake — always assume that however bad things are now, they’re only going to get worse in the future, so you’d better deal with them now. The longer you wait, the worse it’s going to be for you. It’s true more often than you think.
Rule #5: Problems have to be seen to be believed (or: profile, profile, profile)
Here’s the typical project’s approach to performance: Performance problems are identified. Development goes off, thinks about their design and says “We’ve got it! The problem must be X. If we just do Y, everything will be fixed!” Development goes off and does Y. Surprisingly, the performance problems persist. Development goes off, thinks about their design and says “We’ve got it! The problem must be A. If we just do B, everything will be fixed!” Development goes off and does B. Surprisingly, the performance problems persist. Development goes off… well, you get the idea. It’s amazing how many iterations of this some development groups will go through before they actually admit that they don’t know exactly what’s going on and bother to profile their code to find out. If you can’t point to a profile that shows what’s going on, you can’t say you know what’s wrong.
Every developer needs a good profiler. Even if you don’t deal with performance regularly, I say: Learn it, love it, live it. It’s an invaluable tool in a developer’s toolbox, right up there with a good compiler and debugger. Even if your code is running with acceptable speed, regularly profiling your code can reveal surprising information about it’s actual behavior.
Rule #6: 90% of performance problems are designed in, not coded in
This is a hard rule to swallow because a lot of developers assume that performance problems have more to do with code issues (badly designed loops, etc) than with the overall application design. The sad fact of the matter is that in all but the luckiest groups, most of the big performance problems you’re going to confront are not the nice and tidy kind of issues where someone is doing something really dumb. Instead, it’s going to be extremely difficult to pinpoint situations where several pieces of code are interacting in ways that end up being slow. To solve the problems usually requires a redesign of the way large chunks of your code are structured (very bad) or a redesign of the way several components interact (even worse). And given that most pieces of an application are interrelated these days, a small change in the design of one piece of code may cascade into changes in several other pieces of code. Either way it’s not going to be simple or easy. That’s why you need to diagnose problems as soon as you can and get at them before you’ve piled a lot of code on top of your designs.
Also, don’t fall in love with your code. Most programmers take a justifiable pride in the code that they write and even more pride in the overall design of the code. However, this means that many times when you point out to them that their intellectually beautiful design causes horrendous performance problems and that several changes are going to be needed, they tend not to take it very well. “How can we mar the elegance and undeniable usability of this design?” they ask, horrified, adding that perhaps you should look elsewhere for your performance gains. Don’t fall into this trap. A design that is beautiful but slow is like a Ferrari with the engine of a Yugo — sure, it looks great, but you certainly can’t take it very far. Truly elegant designs are beautiful and fast.
Rule #7: Performance is an iterative process
At this point, you’re probably starting to come to the realization that the rules outlined so far tend to contradict one another. You can’t gauge the performance of a design until you’ve coded it and can profile it. However, if you’ve got a problem, it’s most likely going to be your design, not your code. So, basically, there’s no way to tell how good a design is going to be until it’s too late to do anything about! Not exactly, though. If you take the standard linear model of development (design, code, test, ship), you’re right: it’s impossible to heed all the rules. However, if you look at the development process as being iterative (design, code, test, re-design, re-code, re-test, re-design, re-code, re-test, …, ship), then it becomes possible. You will probably have to go through and test several designs before you reach the “right” one. Look at one of the most performance obsessed companies in the software business: Id Software (producers of the games Doom and Quake). In the development of their 3D display engines (which are super performance critical) they often will go through several entirely different designs per week, rewriting their engine as often as necessary to achieve the result they want. Fortunately, we’re not all that performance sensitive, but if you expect to design your code once and get it right the first time, expect to be more wrong than right.
Rule #8: You’re either part of the solution or part of the problem
This is a simple rule: don’t be lazy. Because we all tend to be very busy people, the reflexive way we deal with difficult problems is to push them off to someone else. If your application is slow, you blame one of your external components and say “it’s their problem.” If you’re one of the external components, you blame the application for using you in a way you didn’t expect and say “it’s their problem.” Or you blame the operating system and say “it’s their problem.” Or you blame the user for doing something stupid and say “it’s their problem.” The problem with this way of dealing with these things is that soon the performance issue (which must be so
lved) is bouncing around like a pinball until it’s lucky enough to land on someone who’s going to say “I don’t care whose problems this is, we’ve got to solve it” and then does. In the end, it doesn’t matter whose fault it is, just that the problem gets fixed. You may be entirely correct that some boneheaded developer on another team caused your performance regression, but if it’s your feature it’s up to you to find a solution. If you think this is unfair, get over it. Our customers don’t blame a particular developer for a performance problem, they blame Microsoft.
Also, don’t live with a mystery. At one point in working on the boot performance of my application, I had done all the right things (profiled, re-designed, re-profiled, etc) but I started getting strange results. My profiles showed that I’d sped boot up by 30%, but the benchmarks we were running showed it had slowed down by 30%. My first instinct was to dismiss the benchmarks as being wrong, but they had been so reliable in the past (see rule #3) that I couldn’t do that. So I was left with a mystery. My second instinct was to ignore this mystery, report that I’d sped the feature up 30% and move on. Fortunately, program management was also reading the benchmark results, so I couldn’t slime out of it that easily. So I was forced to spend a few weeks beating my head against a wall trying to figure out what was going on. In the process, I discovered rule #9 below which explained the mystery. Case closed. I’ve seen many, many developers (including myself on plenty of other occasions) fall into the trap of leaving mysteries unsolved. If you’ve got a mystery, some nagging detail that isn’t quite right, some performance slowdown that you can’t quite explain, don’t be lazy and don’t stop until you’ve solved the mystery. Otherwise you may miss the key to your entire performance puzzle.
Rule #9: It’s the memory, stupid.
As I mentioned above, I reached a point in working on speeding up application boot where my profiles showed that I was 30% faster, but the benchmarks indicated I was 30% slower. After much hair-pulling, I discovered that the profiling method that I had chosen effectively filtered out the time the system spent doing things like faulting memory pages in and flushing dirty pages out to disk. Given that: 1) We faulted a lot of code in from disk to boot, and 2) We allocated a lot of dynamic memory on boot, I was effectively filtering out a huge percentage of the boot time out of the profiles! A flip of a switch and suddenly my profiles were in line with the benchmarks, indicating I had a lot of work to do. This taught me a key to understanding performance, namely that memory pages used are generally much more important than CPU cycles used. Intellectually, this makes sense: while CPU performance has been rapidly increasing every year, the amount of time it takes to access memory chips hasn’t been keeping up. And even worse, the amount of time it takes to access the disk lags even further behind. So if you have really tight boot code that nonetheless causes 1 megabyte of code to be faulted in from the disk, you’re going to be almost entirely gated by the speed of the disk controller, not the CPU. And if you end up using so much memory that the operating system is forced to start paging memory out (and then later forced to start paging it back in), you’re in real trouble.
Rule #10: Don’t do anything unless you absolutely have to
This final rule addresses the most common design error that developers make: doing work that they don’t absolutely have to. Often, developers will initialize structures or allocate resources up front because it simplifies the overall design of the code. And, to a certain degree, this is a good idea if it would be painful to do initialization (or other work) further down the line. But often times this practice leads to doing a huge amount of initialization so that the code is ready to handle all kinds of situations that may or may not occur. If you’re not 100% absolutely sure that a piece of code is going to need to be executed, then don’t execute it! Conversely, when delaying initialization code, be aware of where that work is going to be going. If you move an expensive initialization routine out of one critical feature and into another one, you may not have bought yourself much. It’s a bit of shell game, so be aware of what you’re doing.
Also, keep in mind that memory is as important as code speed, so don’t accumulate state unless you absolutely have to. One of the big reasons why memory is such a problem is the mindset of programmers that CPU cycles should be preserved at all costs. If you calculate a result at one point in your program and you might need it later on elsewhere, programmers automatically stash that result away “in case I need it later.” And in some expensive cases, this is a good idea. However, often times the result is that ten different developers think, “All I need is x bytes to store this value. That’s much cheaper than the cycles it took me to calculate it.” And then soon your application has slowed to a crawl as the memory swaps in and out like crazy. Now not only are you wasting tons of cycles going out to the disk to fetch a page that now would have been (now) much cheaper to calculate, you’re also spending more of your time managing all the state you’ve accumulated. It sounds counterintuitive, but it’s true: recalculate everything on the fly; save only the really expensive, important stuff.
Something that I think I’ve forgotten to mention up until now (shame on me!): I’m going to be giving a talk on VB .NET 2.0 at the Bay .NET User Group this Thursday, February 12th. The show starts about 6:45pm and should run for about an hour and a half – more details can be found at the Bay .NET User Group page. Even though the blurb I wrote claims that I’m going to focus entirely on the language, I’m thinking I may talk about some IDE features that relate as well. Should be a good talk.
(I think that Addison-Wesley may also be sending along some pre-production copies of my book for raffling! Just a few more weeks until the real thing comes along!)