Go Easy on the Maintenance Programmer

"Maintenance Programmer": This common and underappreciated species of programmer is often heard muttering and cursing under their breath. Sometimes bald from tearing their hair out in frustration.

From time to time we all have to do it: simple changes and maintenance to code - often other people's code. And the truth is that most code spends far more time in maintenance mode than it did in design, coding, or initial debugging. Yet maintenance programming is notoriously difficult (even if it does tend to get assigned to the most junior developers on a team). I will identify the main reason I think maintenance is difficult, and suggest one specific programming practice that can make your code easier to maintain.

A typical maintenance task is to make some tiny change in a larger existing codebase. The change might be to add a new feature ("we need to support a new kind of mortgage loan") or it might be to fix a simple bug ("the odd-days interest calculation doesn't match what accounting is producing"). Except in very rare and fortunate cases, the maintenance programmer is not particularly familiar with the code in question. They are expected to read and understand it and then make the change. Whatever else happens, they must NOT introduce any new bugs. And since, after all, it's "just a tiny change", they are expected to do all this very quickly.

Of course, the change would have been tiny if it had been in the original requirements when the code was being written. And it might not be too much trouble if you wrote the code yourself last week, so you still remember the whole structure - the classes, the methods, the data structures, and how they all work together. But a maintenance programmer rarely has this luxury: she is typically given a large amount of code with which she is unfamiliar and asked to make some small change.

If the code is perfectly structured for this particular change, then it's still not too difficult: just add a new entry to the "supported loan types" array, or modify the one-line formula where the odd-days interest is calculated. The real challenge comes when the code needs to be refactored in order to support the new "tiny change" - perhaps lots of places in the code assume that all mortgage loans have unique subtypes (and this new one doesn't) or perhaps the odd-days interest calculation is spread out among several classes. (Both are real examples from last week's coding.)

So the maintenance programmer needs to take a codebase she is unfamiliar with, and make a small refactoring - all while keeping foremost in mind that she must NOT risk introducing new bugs.

There are lots of little things that you can do when writing code to make things easier for the maintenance programmer. I strongly recommend writing a "method header" comment for every meaningful method (whether public or private) explaining what it does. I recommend avoiding anything that seems "clever", unless you have a very good reason and extensive comments. I recommend spending an inordinate amount of time carefully choosing good names for your classes, methods, and variables. But these are well-known practices; for this essay I'll try to describe one that is less well-known. I picked it up while teaching myself functional languages (like Haskell).

The programming practice I advocate is this: maximize locality of reference by eschewing side effects and variable rebindings. Which just goes to show that you can make anything sound difficult if you use enough long words.

The key is to realize that the maintenance programmer is not dumb; she is just trying to make a quick change in an unfamiliar codebase. When I make a small change as a maintenance programmer, it would not be atypical for me to spend 3-4 hours reading through the code and 10 minutes typing out the change. Those hours are spent trying to locate the code that affects this feature and making sure that my change won't break anything. If code were written in a way that made it easier to be sure whether a change will break things, it would be much easier to maintain.

Any time that you go to modify a function that has side effects, you have to carefully read through every bit of code that touches the variables involved in that side effect in order to understand whether the code you are changing may break things. If that other code has side effects of its own, you may have to track those down as well. This is how it can take hours to decide whether 10 minutes of trivial code is safe to write.

For instance, here is a simple Java method that submits user orders.

public void submitOrders(List<Order> userOrders) {
    for (Order order: userOrders) {
        if (order.submit()) {
            this.totalValidOrders++;
        }
    }
}

The maintenance programmer has been asked to submit system orders as well as user orders. You might think this would be a straightforward change:

public void submitOrders(List<Order> userOrders, List<Order> systemOrders) {
    for (Order order: userOrders) {
        if (order.submit()) {
            this.totalValidOrders++;
        }
    }
    for (Order order: systemOrders) {
        if (order.submit()) {
            this.totalValidOrders++;
        }
    }
}

And hey, it will only take you 10 minutes to type out this change! But it isn't safe: that change may break things. There is a side effect on the instance variable totalValidOrders, and before you can make this change, you need to understand every use of the totalValidOrders variable. Are there places that assume that, after a call to submitOrders(), it will contain the total number of valid user orders? When system orders are handled, will those places expect to see system orders included? And are there other methods, like invalidateOrders(), that ALSO modify the totalValidOrders variable, and how do THOSE modifications interact? Researching this could take quite some time.
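To make the hazard concrete, here is a minimal runnable sketch of how such a breakage might look. Everything beyond submitOrders() and totalValidOrders is an assumption made up for illustration: the Order stand-in, the OrderProcessor class, and the userSummary() method are hypothetical, not part of any real codebase.

```java
import java.util.List;

public class SharedCounterHazard {
    /** Hypothetical stand-in for Order; submit() simply reports validity. */
    static class Order {
        private final boolean valid;
        Order(boolean valid) { this.valid = valid; }
        boolean submit() { return valid; }
    }

    /** Hypothetical class holding the side-effecting method from above. */
    static class OrderProcessor {
        int totalValidOrders = 0;

        void submitOrders(List<Order> userOrders, List<Order> systemOrders) {
            for (Order order : userOrders) {
                if (order.submit()) {
                    this.totalValidOrders++;
                }
            }
            for (Order order : systemOrders) {
                if (order.submit()) {
                    this.totalValidOrders++;
                }
            }
        }

        /**
         * Hypothetical older code, written back when the field could only
         * ever hold a count of valid USER orders.
         */
        String userSummary() {
            return "Valid user orders: " + totalValidOrders;
        }
    }

    public static void main(String[] args) {
        OrderProcessor processor = new OrderProcessor();
        List<Order> userOrders = List.of(new Order(true), new Order(false));
        List<Order> systemOrders = List.of(new Order(true), new Order(true));
        processor.submitOrders(userOrders, systemOrders);
        // Only one user order is valid, yet the old summary reports 3.
        System.out.println(processor.userSummary());
    }
}
```

Nothing in the "tiny change" itself hints that userSummary() exists; the only way to find it is to search for every use of the field.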

The whole exercise would be much easier for the maintenance programmer if the original designer had used a more functional style of programming without side effects. Suppose that the original looked like this:

/**
 * Submits the orders that were passed in and returns a count of the valid orders.
 */
public int submitOrders(List<Order> userOrders) {
    int validOrders = 0;
    for (Order order: userOrders) {
        if (order.submit()) {
            validOrders++;
        }
    }
    return validOrders;
}

Now the maintenance programmer has a much easier job. She does not need to investigate every use of some variable and try to determine the order in which other methods on the same object get called. Instead, she simply needs to examine the places where the submitOrders() method is invoked and figure out whether those need the total count of valid orders or just the count of valid user orders. This is a much more bounded task, because it involves examining a known list of invocation spots, instead of requiring that one understand every use of a certain object (or at least a field of an object). The change to this:

/**
 * Submits the user and system orders that were passed in and returns a count
 * of the total valid orders.
 */
public int submitOrders(List<Order> userOrders, List<Order> systemOrders) {
    int validOrders = 0;
    for (Order order: userOrders) {
        if (order.submit()) {
            validOrders++;
        }
    }

    for (Order order: systemOrders) {
        if (order.submit()) {
            validOrders++;
        }
    }
    return validOrders;
}

...can be made with far less research.
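As a rough sketch of what one of those invocation spots might look like, the example below wraps the side-effect-free version in a small runnable class. The Order stand-in, the OrderCountCaller class name, and the main() caller are assumptions invented for illustration; only the body of submitOrders() follows the code above (made static here so the sketch is self-contained).

```java
import java.util.List;

public class OrderCountCaller {
    /** Hypothetical stand-in for Order; submit() simply reports validity. */
    static class Order {
        private final boolean valid;
        Order(boolean valid) { this.valid = valid; }
        boolean submit() { return valid; }
    }

    /**
     * Submits the user and system orders that were passed in and returns a
     * count of the total valid orders. No fields are read or written.
     */
    static int submitOrders(List<Order> userOrders, List<Order> systemOrders) {
        int validOrders = 0;
        for (Order order : userOrders) {
            if (order.submit()) {
                validOrders++;
            }
        }
        for (Order order : systemOrders) {
            if (order.submit()) {
                validOrders++;
            }
        }
        return validOrders;
    }

    public static void main(String[] args) {
        List<Order> userOrders = List.of(new Order(true), new Order(false));
        List<Order> systemOrders = List.of(new Order(true), new Order(true));
        // The count arrives as a return value, so this call site is the only
        // place that has to decide what "valid orders" should mean here.
        final int validOrders = submitOrders(userOrders, systemOrders);
        System.out.println("Valid orders: " + validOrders);
    }
}
```

Because the method's signature changed, the compiler flags every caller that still passes a single list, which hands the maintenance programmer the complete list of places to review.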

Posted Fri 18 July 2008 by mcherm in Programming