I need to decide how to define certain parameter types.
The general situation is this: I am refactoring a piece of C++ code to be part of a separate library, so I am in the process of defining and documenting an API for using the included classes and methods. A fundamental design consideration is that the library may be called by third parties, without access to the source code, so I need to make the code as close to bulletproof as possible. (For internal development, at least I know the methods used, how the API will be utilized, and that it will not be abused terribly.)
During this process, I encountered a theoretical dilemma about how to handle certain parameters. Specifically, I was working on method definitions that included sizes and counts that should never be negative (but, of course, I need to prepare for abuse). As an example, say I am reviewing a method that is declared like this:
bool FillBuffer ( byte* pBuffer, int nSize );
Here, pBuffer is a pointer to the buffer to be filled, and nSize is the size of that buffer. Of course, it is not possible for a buffer to be a negative size, so my initial reaction was to redefine it as:
bool FillBuffer ( byte* pBuffer, unsigned uSize );
This makes perfect sense from a theoretical standpoint, but then a practical consideration occurred to me. I always verify parameters with an assertion and, for a public method (as here), abort the routine if verification fails, with an exception or error return as appropriate. In this case, my original method would assert nSize to be greater than zero (and return false), which would catch any negatives. The new method would only catch the case where uSize was zero, but if a careless programmer cast a signed integer (or, worse, let the compiler do it), the current validation check would not identify a problem.
So, there were a few obvious solutions that I considered:
- I could leave the original definition alone, which would catch obvious parameter errors, but would be theoretically incorrect, and if a programmer wanted to pass the size of a static buffer, using sizeof(), there would be a signed/unsigned mismatch.
- I could use the new method definition, which would be correct in theory, and just trust programmers not to abuse the method with invalid parameters (and let them suffer if they do).
- I could use the new method definition and add a sanity check so an extremely large buffer size (i.e., likely a negative value cast improperly) would be rejected, but the drawback there is that any such check would be somewhat arbitrary, and it would limit the functionality for any programmer who truly wanted to use an enormous buffer.
Each of these solutions has advantages and drawbacks. I dislike having a parameter take a type that is not accurate (though not so much as to not have written this code in the first place), but I dislike arbitrary limits even more. However, I know that defensive design is important here, since a careless programmer is, in my experience, the most likely to complain that the library or API is at fault. (I was once threatened with physical violence when I produced a critical review of code written by a nominal “programmer”.)
At this point, I am leaning toward a hybrid solution by overloading the method with both (or multiple) definitions, the original checking for negative sizes as usual before doing an explicit cast of the size value and passing processing to the new/correct method. The advantage is that passing an actual negative number (or signed type) will result in that extra checking, and a programmer could pass a buffer size up to the limit of the unsigned type. The disadvantages are the additional work needed to create the extra stub(s), loss of type checking during static analysis, and the fact that our careless friend could still cast a value to create problems (but then it should be quite obvious, at least).
This post is an exercise in the process of working through a problem by simply writing down the issues, which often results in a solution (or decision) by the time one is finished with the description. (It did here.) I would, however, welcome any comments on my proposed solution, or other suggestions.
Finally, yes, I know that int and unsigned are not ideal parameter types for this in the first place, but I used them for the purpose of illustration. (The principle also applies to object counts and other similar parameter types.)