October | 2013 | Panopticon Central

A friendly word of advice: if you’re thinking of rewriting your codebase… don’t. Just don’t. Please. Really.

Yes, I know, your codebase as it exists today is a steaming pile of crap. Yes, I know that you had no idea what you were doing when you first wrote it. Or that it was written by idiots who have long since moved on. Or that the world has changed and whatever style of development was in vogue at the time is now antiquated. Yes, I know that you’re tired of dealing with the same old architectural limitations over and over again. And that the programming language you wrote it in is no longer fashionable. And that there are lots of new features in the newer versions of the language that would allow you to express things so much more elegantly. Or maybe, god help me, you think that you can write it more modularly this time and that will allow you to iterate more quickly on new features.

Whatever you think, let me make a bold pronouncement: YOU ARE WRONG. WRONG, WRONG, WRONG. FOR THE LOVE OF GOD STOP WHAT YOU ARE DOING RIGHT NOW AND PUT DOWN THE KEYBOARD BEFORE YOU HURT ANYONE.

OK, now that that’s out of my system, let’s get to the natural question: why? Why am I saying this? Because if you ignore my advice and plunge ahead anyway, you’re going to run into what I modestly call:

Vick’s Law of Rewrites: The cost of rewriting any substantial codebase while preserving compatibility is proportional to the time it took to write that codebase in the first place.

In other words, if it took you one year to write your codebase, it’s going to take you on the order of one year to rewrite that codebase. But, of course, we’re not talking about codebases that are only around one year old, are we? No, we aren’t. Instead, people usually start talking about rewrites around the 5-7 year mark. Which means that, as per the law above, it’s going to take on the order of 5-7 years to rewrite that codebase and preserve any semblance of compatibility. Which is not what most people who embark on large rewrites usually think. They think that they can do it in substantially less time than it took in the first place. And they’re always wrong, in my experience.

I first came to this law way back when I worked on Access and the leaders of the project (who, I realize now, were still laughably young, but seemed very old and wise to me at the time) started talking about rewriting Access after we shipped version 2.0 (development time: approx. 4 years total at that point). At the time, I owned the grid control that was the underpinning of the Access data sheet and other controls, and let me tell you–that piece of code was a bit of a beast. Lots of corner cases to make the UI work “just so.” I was talking to one of the leads about this and he dismissed me with a proverbial wave of the hand: “Oh, no, since we understand how that code works now, we can rewrite it in three months.” I think it was at that moment I knew that the rewrite was doomed, although it took three more years for the team to get to that realization.

The bottom line is that the cost of most substantial codebases comes not from the big thinking stuff, but from details, details, details and from bugs, bugs, bugs. And every time you write new code–even if you totally understand what it’s supposed to be doing–you’ve still got the details and bugs to deal with. And, of course, it’s highly unlikely that you do totally understand what it’s supposed to be doing. One of the things that continually humbled me when I worked on Visual Basic was how we’d be discussing some finer point of, say, overload resolution, and someone would point out some rule that I didn’t remember. No way, I’d say, that’s not a rule. Then I’d go check the language specification, which I wrote, and there it would be, in black and white. I had totally forgotten it even though I had written it down myself. And the truth is, there are a million things like this that you will inevitably miss when you rewrite, and that you will then have to spend time fixing. And don’t even get me started on the bugs. Plus, you’re probably rewriting on a new platform (because who wants to write on that old, antiquated platform you were writing on before?), and now you’ve got to relearn all the tricks you knew about how to work with the old platform but which don’t work on the new one.

In my experience, there are three ways around this rule:

1. You’re working on a small or very young codebase. OK, fine. In that case it’s perfectly OK to rewrite. But that’s not really what I’m talking about here.
2. Decide that you don’t care so much about preserving compatibility or existing functionality. Downside: Don’t expect your users to be particularly pleased, to put it mildly (see: Visual Basic .NET).
3. Adopt a refactor rather than rewrite strategy. That is, instead of rewriting the whole codebase at once, take staged steps towards the long-term goal, either rewriting one component at a time, or refactoring one aspect of the codebase at a time. After each step, stabilize, fix problems, and so on before moving to the next. Downside: Not as sexy or satisfying as chucking the whole thing out and “starting clean.” Requires actually acquiring a full understanding of your existing codebase (which you probably don’t have). Plus, this will take a lot longer than even rewriting, but at least you always have a compatible, working product at every stage of the game.
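To make the third option a little more concrete, here is a minimal Python sketch of replacing one component at a time while checking it against the old behavior. Everything in it is hypothetical and made up for illustration (`legacy_format_cell`, `new_format_cell`, the sample values); it is not code from Access or Visual Basic, just one way the idea might look. The "clean" replacement forgets a corner case the old code handled, and the comparison catches that before anything is switched over.

```python
from typing import Callable, List, Tuple

Formatter = Callable[[object], str]


def legacy_format_cell(value: object) -> str:
    """Hypothetical old routine, full of accumulated corner cases."""
    if value is None:
        return ""
    if isinstance(value, bool):
        return "Yes" if value else "No"
    if isinstance(value, float):
        return f"{value:.2f}"
    return str(value)


def new_format_cell(value: object) -> str:
    """Hypothetical rewrite that forgot the boolean corner case."""
    if value is None:
        return ""
    if isinstance(value, float):
        return f"{value:.2f}"
    return str(value)


def find_mismatches(
    old: Formatter, new: Formatter, samples: List[object]
) -> List[Tuple[object, str, str]]:
    """Return (input, old_output, new_output) wherever the two disagree."""
    mismatches = []
    for sample in samples:
        expected, actual = old(sample), new(sample)
        if expected != actual:
            mismatches.append((sample, expected, actual))
    return mismatches


def choose_implementation(
    old: Formatter, new: Formatter, samples: List[object]
) -> Formatter:
    """Switch to the new code only if it matches the old code on every sample."""
    return old if find_mismatches(old, new, samples) else new


# Assumed sample inputs; in practice these would come from real usage.
SAMPLES: List[object] = [None, 3, 2.5, "abc", True]

# The product keeps shipping with whichever implementation is known to be compatible.
format_cell = choose_implementation(legacy_format_cell, new_format_cell, SAMPLES)
```

With these samples the new routine disagrees on `True`, so `format_cell` stays bound to the legacy code until the replacement is fixed. The check is only as good as the samples you feed it, which is another way of saying you still need to understand the existing codebase.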

Rewriting can do enormous damage to a product, because you typically end up freezing the development of your product until the rewrite is done. Maybe at best you leave behind a skeleton crew to pump out a few minor features while the bulk of the team (esp. your star coders) works on the fun stuff. Regardless, this means your product stagnates while the rest of the world moves on. And if, god forbid, the rewrite fails, then you run the risk of your team moving on and leaving the skeleton crew as the development team. I’ve seen this happen to products that, arguably, never really recovered from it.

So please, learn to live with that crappy codebase you’ve got and tend to it wisely. The product you save may be your own.


Panopticon Central

a blog on programming languages, the tech industry, and other stuff…


Thinking of rewriting your codebase? Don’t.