Duplication versus Boilerplate

In my mind there’s a big difference between duplication and boilerplate.

Code duplication occurs when two functions (or, more generally, two blocks of code) do approximately the same thing with minor variations. If those two blocks live in different places in the codebase, then when one gets changed (e.g. to fix a bug) the other will likely be overlooked. Such is the case with the old implementation of “find”, which computes the dimensions of the return matrix (among other things) in multiple places. When you eliminate code duplication, you generally have to make the replacement code more generic, and then you also need ways to construct the specific instances that get passed to the generic code.
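As a minimal sketch of that idea (every name here is made up for illustration; none of it comes from the actual “find” code), suppose two functions used to contain the same loop with a different test inside. The loop moves into one generic function, and each former duplicate shrinks to a specific instance that supplies only the part that varies:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Generic code (hypothetical example): counts the elements for which
// `pred` returns true. The traversal logic now lives in exactly one place.
template <typename Pred>
std::size_t count_matching(const std::vector<int>& values, Pred pred)
{
  std::size_t count = 0;
  for (int v : values)
    if (pred(v))
      ++count;
  return count;
}

// Specific instances: each one supplies only the test that differs.
std::size_t count_positive(const std::vector<int>& values)
{
  return count_matching(values, [](int v) { return v > 0; });
}

std::size_t count_even(const std::vector<int>& values)
{
  return count_matching(values, [](int v) { return v % 2 == 0; });
}

// Usage: both wrappers go through the same loop.
void count_usage_example()
{
  const std::vector<int> values{-2, -1, 0, 1, 2, 3};
  assert(count_positive(values) == 3);  // 1, 2, 3
  assert(count_even(values) == 3);      // -2, 0, 2
}
```

A bug fix to the loop in count_matching now reaches both callers, which is exactly what the duplicated version could not guarantee.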

In an intelligent modern language like D, or a language that relies on higher-order functions like Haskell, OCaml, or Lisp (any functional language, really), the story ends there. You make your code more generic and pass the proper instantiations to it.

In an old, cluttered imperative language like C++, constructing the proper instances of a generic function tends to require a lot of boilerplate: code that consists of many short blocks with the same structure. Boilerplate occurs all in one place, and unlike with code duplication, there’s no danger of an unsuspecting future developer bugfixing your code by changing one block while forgetting to change the others. It does, strictly speaking, contain “duplicated code”. The difference is that the repeated piece tends to be a single function call or a method header. It’s short, and difficult to abstract any further (without changing languages).

If you see a block of code that looks like:
switch (n)
{
  case 1: return call_func<1> (args);
  case 2: return call_func<2> (args);
  case 3: return call_func<3> (args);
  default:
    error("call_func should be called with a number from 1 to 3");
}

please recognize this for what it is. This is boilerplate. It’s necessarily ugly because it’s written in C++, but it’s not “code duplication” in the same way that two implementations of binary search in different files are “code duplication”. You can be fairly sure that whoever wrote this code struggled with it, thought about it, and concluded that this was the most generic way to do whatever they were trying to do.
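To see why a switch like this is hard to shrink, here is a self-contained sketch. The body of call_func, the wrapper name dispatch_call_func, and the use of an exception in place of error() are all my own assumptions for illustration. The point is that a template argument has to be known at compile time, so writing call_func<n> with a run-time n will not compile, and somebody has to spell out each case:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for the real template. N is a compile-time
// constant, so each value of N produces a separate instantiation.
template <int N>
int call_func(int x)
{
  return x * N;
}

// The boilerplate: map a run-time n onto the matching instantiation.
// `return call_func<n>(x);` is not valid here because n is not a
// constant expression.
int dispatch_call_func(int n, int x)
{
  switch (n)
    {
    case 1: return call_func<1>(x);
    case 2: return call_func<2>(x);
    case 3: return call_func<3>(x);
    default:
      throw std::invalid_argument(
        "call_func should be called with a number from 1 to 3");
    }
}

// Usage: the caller only ever sees a run-time integer.
void dispatch_usage_example()
{
  assert(dispatch_call_func(2, 10) == 20);
  assert(dispatch_call_func(3, 10) == 30);
}
```

All three cases sit next to each other in one function, so anyone who changes one of them has the others right in front of them.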

 
