Ending Legacy Code In Our Lifetime

Most software developers don't have an exact definition of legacy code, but to paraphrase Potter Stewart, they "know it when they see it." Unfortunately, legacy code has something else in common with Stewart's famous subject--it is embarrassingly ubiquitous.

What is legacy code?

Legacy is the degree to which code:

  • fails to capture essence
  • fails to communicate essence
  • captures irrelevant detail (ceremony)

The modern interpretations of legacy code are all implications or symptoms of the above. For example, consider the idea of legacy code as "written by someone else and impossible to maintain." This indicates a failure to capture and communicate essence, usually on multiple levels:

  • Code is unreadable. This is easy to test: Can you figure out what the code does without the original developer at your shoulder?
  • Often even the original developer can't read the code.
  • Code does not have unit or acceptance tests. Code without tests does not capture essence. It is automatically legacy code, even if it was written this morning.

The Next Big Thing

Organizations often jettison legacy code when they jump to the Next Big Thing. In the circles I run in, Java Web Development was the Last Big Thing. Ruby and Rails Web Development is the Current Big Thing. The Next Big Thing is ... well, a subject for a different post.

For IT buyers, these Big Things are Sisyphean moments: the buyers get to start over at the bottom of the hill, rolling the same damn rock again. This isn't all bad: Platform shifts provide a rare and brief opportunity to jettison a codebase that is legacy crap. Unfortunately, there is little reason to believe that the next codebase will be much better.

It doesn't have to be this way. As software developers, our goal should be to build this generation's code so that the Next Big Thing won't force a complete "do over." That's pretty ambitious, but if we try hard and miss, the consolation prize would still be pretty good: improved maintainability and reuse within this technology cycle.

To get there, we have to start by rejecting ceremony.

Essence vs. Ceremony

Good code is the opposite of legacy code: it captures and communicates essence, while omitting ceremony (irrelevant detail). Capturing and communicating essence is hard; most of the code I have ever read fails to do this at even a basic level. But some code does a pretty good job with essence. Surprisingly, this decent code still is not very reusable. It can be reused, but only in painfully narrow contexts, and certainly not across a platform switch.

The reason for this is ceremony: code that is unrelated to the task at hand. This code is immediate dead weight, and it often vastly outweighs the code that is actually getting work done. Many forms of ceremony come from unnecessary special cases or limitations at the language level, e.g.:

  • factory patterns (Java)
  • dependency injection (Java)
  • getters and setters (Java)
  • annotations (Java)
  • verbose exception handling (Java)
  • special syntax for class variables (Ruby)
  • special syntax for instance variables (Ruby)
  • special syntax for the first block argument (Ruby)

High-ceremony code damns you twice: it is harder to maintain, and it needs more maintenance. When you are writing high-ceremony code, you are forced to commit to implementation approaches too early in the process, e.g. "Should I call new, go through a factory, or use dependency injection here?" Since you committed early, you are likely to be wrong and have to change your approach later. This is harder than it needs to be, since your code is bloated, and on it goes.
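
To make the getters-and-setters item concrete, here is a minimal sketch. It is written in Python purely for illustration (the examples named above are Java and Ruby), and the names CeremonialPoint, Point, and manhattan_distance are hypothetical, invented for this example. Both classes capture the same essence, a pair of coordinates; the first buries it under accessors that do nothing.

```python
from dataclasses import dataclass


class CeremonialPoint:
    """Hypothetical high-ceremony version: four accessors that add no behavior."""

    def __init__(self, x, y):
        self._x = x
        self._y = y

    def get_x(self):
        return self._x

    def set_x(self, x):
        self._x = x

    def get_y(self):
        return self._y

    def set_y(self, y):
        self._y = y


@dataclass
class Point:
    """Hypothetical low-ceremony version: just the essence, two coordinates."""
    x: float
    y: float


def manhattan_distance(a, b):
    # Client code reads the fields directly; there is no accessor layer to maintain.
    return abs(a.x - b.x) + abs(a.y - b.y)
```

If a field later needs validation or computation, Python lets you swap it for a property without touching callers, so the direct version does not force the early commitment described above. Whether the same holds in your language is exactly the kind of language-level limitation the list points at.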

Getting to Essence

I have found the following techniques helpful in writing low-ceremony code.

  • Identify and implement real needs in a tight feedback loop with the customer.
  • 100% test coverage. When you are testing your own code, you are temporarily in the role of the client. If you find your own code painful to use, others surely will. (The drive to coverage also flushes out the weak: It is difficult for weak developers, or weak languages, to get 100% code coverage. Of course this is only a proxy measure for quality, but it is a start.)
  • Relentless code review, both during pairing and after the fact.
  • Insist on well-composed methods. And classes. And packages! This little-documented pattern is simple to learn, difficult to master, and more important than the entire rest of the design patterns movement.
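
As a rough sketch of what a well-composed method can look like, consider the following. It is in Python for illustration only, and the invoice domain and the names invoice_total, sum_line_items, and tax_on are assumptions made up for this example. The top-level method reads as a short list of named steps at one level of abstraction, and each step is small enough to test on its own.

```python
def invoice_total(line_items, tax_rate):
    """Composed method: each step is one named call at the same level of abstraction."""
    subtotal = sum_line_items(line_items)
    tax = tax_on(subtotal, tax_rate)
    return round(subtotal + tax, 2)


def sum_line_items(line_items):
    # Each line item is assumed to be a (quantity, unit_price) pair.
    return sum(quantity * unit_price for quantity, unit_price in line_items)


def tax_on(amount, tax_rate):
    # tax_rate is assumed to be a fraction, e.g. 0.1 for 10%.
    return amount * tax_rate
```

Writing a test for invoice_total puts you in the client's seat, as the coverage bullet suggests: if calling it is awkward, that is feedback about the design, not about the test.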

What has worked for you?


Notes

  • This essay began as a keynote at Code Freeze 2008. The slides are available.
  • The techniques listed above are not silver bullets, they don't work in isolation, and they require strong developers. For example, I have seen 100% test coverage on very bad code. You will need a synergy of all these techniques, plus probably some others that I have failed to identify.
  • I first encountered Composed Method in Kent Beck's Smalltalk Best Practice Patterns. Recommended.