Monday, October 18, 2010

The Art of Firefighting - Part 2


Locality of reference is a concept that holds true throughout the software development life cycle. At any given time, only a few chunks of code are evolving significantly. These could be the most bug-infested code segments, or the segments most affected by a change request or a feature addition. Irrespective of the nature and phase of software development, there will invariably be a few high-churn code chunks (I like to call them fire pockets).

Exactly which chunks are churning may vary with the stage of the development life cycle. Towards the beginning of development, framework and API code might see significant changes while other modules are just static stubs. Later, the frameworks will have stabilized but the other modules will see high activity. Still later in the development cycle, some bug-infested fire pockets might see a lot of code change.

How is this information relevant for firefighting? :-)

Let us examine some typical characteristics of our situation.

- There isn't enough time. Not the routine "not enough time" but a severe time constraint.
- Lack of time often leads to less than usual unit testing and QA testing.
- Code changes are likely to introduce more bugs.
- The more the code changes, the more likely it is that new bugs are introduced, and this relationship is not linear.

Add to these the fact that the bulk of the code changes are happening in a few fire pockets and voila, the next firefighting mantra is right there: "do not mix tasks". Never try to fix multiple issues in one go. It takes far less time to fix an issue, test it, check in the fix, and then move on to the next issue than to implement two or more fixes at once and then test them together.

While you are in the moment, it might appear that you have a handle on the issues and can easily fix more than one issue in one go, but this is like running hard and fast through a forest. Every new bug you introduce takes much more time to find and fix, simply because a lot more code has churned and you can therefore have less confidence in it.

I have witnessed far too many delays and heartburns that originated in the temptation to club fixes together, and yet I have to concede that the temptation to add just one more fix to the mix is strong. It is a fire that fuels itself.

To carry the jungle analogy further, if you try to build a shelter and cook food at the same time, you are likely to send an open dinner invitation to some of your cohabitants, and guess who's on the menu?

Moreover, pause and think about how much time difference we are talking about between the two approaches: fixing one issue, testing it, checking it in, and then picking up the next, versus fixing both, testing them both, and checking them in together. Even if we assume the best case scenario, i.e. no new bugs are introduced, the only effort saved is the time for one extra code commit and one round of the tests that are common to both fixes. Given that you are in a fire pocket, racing against time, with reduced testing effort, I'll leave it up to you to calculate the probability of such a best case scenario and its RoI.
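If you would rather have a machine do that calculation, here is a rough back-of-the-envelope sketch in Python. It is only a toy model of the comparison above, not a measurement of anything: every name and field (Costs, p_bug, debug_one, debug_two and so on) is something I made up for illustration, the two fixes are assumed to introduce bugs independently, and a bug hiding in the combined change is assumed to cost a single, longer debugging session.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    """All values are hypothetical inputs, in minutes unless noted."""
    fix: float           # time to implement one fix
    own_tests: float     # tests that are specific to one fix
    shared_tests: float  # tests that are common to both fixes
    commit: float        # time for one check-in
    p_bug: float         # assumed chance (0..1) that one fix adds a new bug
    debug_one: float     # time to find a new bug when only one fix churned
    debug_two: float     # time to find a new bug when both fixes churned


def one_at_a_time(c: Costs) -> float:
    """Expected time when each fix is tested and checked in on its own."""
    per_fix = (c.fix + c.own_tests + c.shared_tests + c.commit
               + c.p_bug * c.debug_one)
    return 2 * per_fix


def both_together(c: Costs) -> float:
    """Expected time when both fixes are tested and checked in together."""
    # Assumption: the fixes introduce bugs independently of each other.
    p_any_bug = 1 - (1 - c.p_bug) ** 2
    # The shared tests and the commit happen only once here.
    return (2 * c.fix + 2 * c.own_tests + c.shared_tests + c.commit
            + p_any_bug * c.debug_two)


def best_case_saving(c: Costs) -> float:
    """What mixing saves if no new bug shows up at all."""
    return c.shared_tests + c.commit


if __name__ == "__main__":
    # Made-up numbers; plug in your own.
    c = Costs(fix=30, own_tests=10, shared_tests=15, commit=5,
              p_bug=0.3, debug_one=40, debug_two=120)
    print("one at a time   :", round(one_at_a_time(c), 1))
    print("both together   :", round(both_together(c), 1))
    print("best case saving:", round(best_case_saving(c), 1))
```

Set p_bug to zero and mixing wins by exactly the best case saving, i.e. one commit plus one round of the shared tests. How quickly that advantage disappears depends entirely on the values you assume for p_bug and debug_two, and a fire pocket is precisely where both tend to be high.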

This article is the second in the series; you can find the first one here: The Art of Firefighting - Part 1.

PS: Pointers to relevant Dilbert cartoons are welcome.
