The drive-by coding anti-pattern

Almost every team does it, but it significantly degrades code health, and does even worse for team culture.

October 30, 2019 George Hadjiyiannis

10 minutes read

I am not sure why, but out of the set of all anti-patterns, drive-by coding is the one I am personally most disturbed by. Perhaps that's because it's so unnecessary. Perhaps because it's indicative of laziness on the part of the developer. Perhaps because it is, itself, a smell of much more serious cultural issues. For exactly this last reason, if not for any other, I feel very strongly that this particular anti-pattern should be uprooted mercilessly whenever noticed, before it becomes endemic and destroys both the health of the code-base and the culture of the team.

In its essence, the anti-pattern goes as follows: a developer is called upon to make a change to a piece of code he is not familiar with. Instead of taking the time to understand the relevant code and identify a solution that is reasonably compatible with the structure of the existing code, he focuses on making a minimal change to just solve the immediate problem at hand. Worried that he may now be violating the integrity of the original code, he makes a copy of the code, inserts the modification in the copy, and puts a conditional in place that triggers the copy only in the cases where his modification should kick in.

While one might think that this does not happen in many teams, and that where it does, it is limited in extent, reality does not seem to reflect this thinking. Especially in software maintained by large distributed teams, the practice seems fairly widespread, with a meaningful percentage of the code being duplicated.

Impact on code health

At first, this might seem unintelligent but not particularly damaging. There is the obvious code duplication, which means that if you need to correct or modify the behavior of any of the duplicated code, you need to do it in two places. Assuming that you know the code has been duplicated, that is annoying but not too dangerous. Unfortunately, that is the lightest of the issues that this pattern causes (and even that assumes you know of, and can clearly identify, the duplicated code).

The first complication is that it is not obvious how one would know that there's a duplicate copy. Simply tracing the code and seeing which paths get executed can easily miss the duplicate, since the conditional that triggers it is supposed to trigger only in the very limited circumstances to which the developer wanted to apply this modification. If you find it through a trace of the code, it will be only through sheer luck. At the same time, the only two ways to find the duplicate by reading the code are either to stumble upon the duplicate and notice that it is not the same as the original, or to find the conditional that switches between the two versions. Note that this last option might not be as likely as you think, since the conditional may actually be quite far from the duplicate. In an attempt to minimize the damage, a developer might modify the code from

	instance.do_X();

to

	if (exceptional_condition)
		instance.do_modifiedX();
	else
		instance.do_X();

with the methods do_X() and do_modifiedX() being quite far from the conditional, possibly in an entirely different file. At any rate, the risk that the duplicate may not be updated is substantial. Typically the anti-pattern occurs when a developer who does not know the code well makes the modification, because the developer who does is temporarily unavailable for some reason. Future updates will most likely be made by the person who does know the code. This engineer will not know of the duplicate, and he understands the original code well enough to be able to modify it without having to read the surrounding code. Unless either the duplicate or the conditional ends up on the same page as the original, he is unlikely to notice the changes and will therefore miss the need to update the duplicate.
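To make the risk concrete, here is a minimal Python sketch of what the code might look like some time after such a modification. Every name in it (`Invoice`, `total`, `total_modified`, `is_wholesale`, `billable_amount`) is hypothetical and chosen only for illustration. The sketch assumes that a later bug fix (rounding to cents) was applied to the original method by someone who did not know the copy existed.

```python
from dataclasses import dataclass


@dataclass
class Invoice:
    """Hypothetical example class; not taken from any real code base."""
    net: float
    tax_rate: float
    is_wholesale: bool = False

    def total(self) -> float:
        # Original method. A later bug fix added rounding to cents here.
        return round(self.net * (1 + self.tax_rate), 2)

    def total_modified(self) -> float:
        # Drive-by copy made before the fix: same logic plus a 10% discount.
        # The rounding fix was never ported to this copy.
        return self.net * (1 + self.tax_rate) * 0.9


def billable_amount(invoice: Invoice) -> float:
    # In a real code base this conditional may live far away from both methods.
    if invoice.is_wholesale:
        return invoice.total_modified()
    return invoice.total()
```

In this sketch the regular path returns an amount rounded to cents, while the rarely exercised wholesale path still returns an unrounded amount. Nothing crashes, so the divergence is easy to miss.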

In addition, it is rarely the case in drive-by coding that automated tests are generated to test the new path. After all, if there was no time or inclination to properly understand the code one is modifying, it is unlikely that there was time or desire to write the necessary automated tests. As a result, this already fairly brittle modification can now fail silently, especially in the scenario where the original was updated but the engineer did not know the duplicate existed, and therefore failed to update the duplicate.

A different, and even more serious challenge is the fact that the duplicate may impact the operation of the original. This may sound counterintuitive; after all, the whole point of creating a duplicate was to avoid having to modify the original, and thus ensure that what used to work before, still works after the modification. Unfortunately, duplicating the code is not enough to ensure that the original will still work, as it is possible that the duplicate will violate data invariants that the original relies on. Imagine, for example, that the original code writes records to a file handle in batches, along with some metadata for each batch. If the duplicate code writes to the same file handle, then it is easy to see how the batches end up interleaved, and the metadata get out of sync with the file records. One could try to solve this by making sure that the duplicate updates the metadata of the original, but that can make it even more likely that an error could corrupt the metadata and break the original.
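The batch example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `BatchWriter`, `write_urgent_record`, and `read_batch` are hypothetical names, the "file handle" is an in-memory `io.StringIO`, and the metadata is assumed to be a simple index of where each batch starts and how many records it holds.

```python
import io


class BatchWriter:
    """Hypothetical original: appends batches and tracks where each one starts."""

    def __init__(self, handle):
        self.handle = handle
        self.lines_written = 0
        self.index = []  # one (first_line, record_count) entry per batch

    def write_batch(self, records):
        self.index.append((self.lines_written, len(records)))
        for record in records:
            self.handle.write(record + "\n")
        self.lines_written += len(records)


def write_urgent_record(handle, record):
    # Hypothetical drive-by duplicate: writes to the same handle directly,
    # bypassing the line counter that the original's index depends on.
    handle.write(record + "\n")


def read_batch(handle, index, batch_number):
    # Uses the original's metadata to slice one batch out of the stream.
    first_line, count = index[batch_number]
    lines = handle.getvalue().splitlines()
    return lines[first_line:first_line + count]


shared = io.StringIO()
writer = BatchWriter(shared)
writer.write_batch(["a1", "a2"])
write_urgent_record(shared, "urgent")  # the duplicate path kicks in once
writer.write_batch(["b1", "b2"])
second_batch = read_batch(shared, writer.index, 1)
```

After the single write from the duplicate path, the index entry for the second batch points one line too early, so `second_batch` no longer contains the records that the original code wrote for it. The original code was never touched, yet it is now broken.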

Additional challenges crop up with code comprehension. Remember that typically the duplicate is created by someone other than the person or persons who know the original code well. The problem is that now those people (perhaps including the author of the original) do not know the duplicate well. This means that, even though the two pieces of code together form a unit of functionality, there is no single person who understands it completely. In other words, the existence of the duplicate has the effect of reducing the extent to which the engineers understand the code that has not been modified! This is exemplified by the fact that in cases where the original gets replaced by a new implementation (and, as a result, removed), the duplicate sometimes remains, for the same reason that the code was duplicated in the first place: the developer does not understand the impact of removing it and tries to minimize the risk. This leads to a pattern where a lot of code duplicated in this fashion becomes dead code, but cannot be removed without spending a lot of time analyzing the logic behind everything. Needless to say, having so much dead code lying around further complicates the problem of code comprehension.

Impact on culture

As serious as the effects on code health are, in my eyes they take a back seat to the damage done to team culture. This anti-pattern tends to not only normalize, but eventually even validate incorrect behavior, making it that much harder to resist, let alone reverse, the decay in best practices that inevitably follows. A lot of it has to do with how this anti-pattern occurs in the first place. Imagine the following scenario: a modification needs to be made, either to accommodate an extension of functionality or to fix a bug. Unfortunately the person who knows the code to be modified is on vacation. The change is not time critical, and could easily be deferred until that person returns. Except, that is, for the fact that the sales team is visiting the customer that asked for the modification in the hope of cross-selling further products, and they are certain the customer will ask about the change. They apply significant business pressure and eventually some unwitting developer agrees to make a minimal change in the fashion described above. The change is made and pushed to production without the sky falling, and the sales team convinces the client to sign a fairly significant addition to the contract. Eventually the change will cause significant challenges, but these will come so far into the future that the risky business decision that caused them will be long forgotten.

In general, this anti-pattern tends to occur as a result of business pressure forcing a risky decision. The benefit to the company is obvious and immediate, but the consequences are temporally very far removed. This tends to validate the risky business decision and reinforce this behavior. And just to make it absolutely clear, my finger is not pointing just at the business people who applied the pressure. The engineers themselves are often just as much to blame, and their behavior just as strongly reinforced. Think about it: drive-by coding is certainly bad behavior in the realm of engineering best practices, but some engineer whose standards of code hygiene are perhaps too low decided to go for it. The fact that the business value was extracted immediately, while the cost is nowhere to be seen yet, validates his behavior as well. When an engineer with a healthier attitude towards code health reproaches said developer, he will simply answer that it was worth it, because the company got the contract and nothing broke. This will certainly erode the engineering culture as well.

There are actually even second-order effects that cause this erosion in culture to snowball. First of all, the sales team in the scenario above is likely to heap all sorts of public praise on the renegade developer who went along with the risky decision, while at the same time castigating those engineers who showed enough wisdom to resist. This will inevitably cause any engineers who do not consider this a matter of principle to wonder if aligning with such risky decisions is the best way to be rewarded. In other words, this will transform the culture of the company into one that rewards short-term risk, and views almost all risk management, however prudent, as excessive. The second effect is that it will create a bias towards unholy alliances: in the example above, sales people with a horizon that extends no further than the next contract signature will team up with developers with more desire to please than wisdom, forming a silo that will be hard to break up and is extremely prone to unreasonably risky behavior. This silo will reduce the ability of anyone on the outside to correct what is happening, and even enable a sub-culture to form, which will align against the culture of more risk-aware people in the company. If you have read my previous article on “Evolving Team Culture” you already know how dangerous such antagonistic sub-cultures are.

Going Forward

I'd like to leave you with just two points:

Most people find it hard to believe that this could happen to them. They are fairly certain that they do not have this anti-pattern. This, unfortunately, includes every team where I found the anti-pattern in practice. Don't rely on faith in your organization, but instead go have a look at your code. The tell-tale sign of drive-by coding is code duplication. Most modern code analysis tools (such as SonarQube) can measure and locate code duplication. Not all duplication is the result of this particular anti-pattern, but unless you have almost no duplication I would look at what's behind the scenes and make sure.
If you do see drive-by coding, show it no mercy! You need to engage in an effort to change culture, and sustain this effort for the long term. Make sure that your engineers regard drive-by coding with the same apprehension as cheating on their spouse, or stealing from their parents. And teach the entire organization that the company is guaranteed to pay for the consequences, even if this happens long after the contract is signed. If you don't, you risk paying a price much higher than having a buggy software product.
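For the first point, if no analysis tool is at hand, even a crude check can give a first impression of how much duplication exists. The following Python sketch is a naive illustration only, not how SonarQube or any other tool works: the function name `find_duplicate_blocks` and the `window` parameter are hypothetical, and it only catches runs of lines that are identical after stripping whitespace.

```python
from collections import defaultdict


def find_duplicate_blocks(sources, window=4):
    """Return groups of locations whose `window` consecutive lines match.

    `sources` maps a file name to its text. Lines are compared after
    stripping whitespace, and blank lines are ignored.
    """
    seen = defaultdict(list)
    for name, text in sources.items():
        # Keep the original line number next to each non-blank, stripped line.
        lines = [(number, line.strip())
                 for number, line in enumerate(text.splitlines(), start=1)
                 if line.strip()]
        for i in range(len(lines) - window + 1):
            chunk = tuple(content for _, content in lines[i:i + window])
            seen[chunk].append((name, lines[i][0]))
    # A chunk recorded at more than one location is a candidate duplicate.
    return [places for places in seen.values() if len(places) > 1]
```

Each group in the result lists the file name and starting line of every place the same block appears. A copy that was modified in the middle will only match in part, so treat the output as a starting point for reading the code, not as a verdict.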