Idiot-Proofing My R Scripts

For one of my research projects I’m writing software in R, a language built largely around statistics. Like most self-taught coders, I started out pretty rough. My code was barely maintainable. It got to the point where I found a bug in one of the oldest bits and, looking back at it, could only think, “How did this ever work in the first place?” I couldn’t make heads or tails of my own code! As I progressed my code got cleaner, but only I knew what was going on with it. Finally, as the code base has grown, it has outgrown my ability to remember all the little details about what needs to go where.

Anyway, I’m packaging up all my code so I can submit it to CRAN at some point. As part of that effort I’m working on idiot-proofing my software, as if that’s even possible. In practice this means going in and adding error-checking code, especially code that checks the inputs. Of course, I could go on and on with this sort of thing. My plan is to write good documentation for all my software (okay, it’s happening slowly but surely) and then implement some basic checks in the code, such as checking that the names fed to the functions make sense. The next phase of this plan, which I will probably save for version 1.1 of the package, would be to do type checking on the arguments provided by the user.
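To give a flavor of what a name check can look like, here is a minimal sketch in base R. The helper `check_names()`, the function `fit_model()`, and the setting names `method`, `tolerance`, and `max_iter` are all made up for illustration; they are not taken from the actual package.

```r
# Hypothetical helper: stop with a readable message if any supplied
# names are not in the set the function knows about.
check_names <- function(supplied, allowed, what = "argument") {
  unknown <- setdiff(supplied, allowed)
  if (length(unknown) > 0) {
    stop(sprintf("unknown %s name(s): %s; expected one of: %s",
                 what,
                 paste(unknown, collapse = ", "),
                 paste(allowed, collapse = ", ")),
         call. = FALSE)
  }
  invisible(TRUE)
}

# Hypothetical analysis function that takes named settings via `...`.
fit_model <- function(data, ...) {
  settings <- list(...)
  check_names(names(settings),
              allowed = c("method", "tolerance", "max_iter"),
              what = "setting")
  # The real work would go here; this sketch just returns the names it accepted.
  names(settings)
}
```

With this in place, a misspelled setting such as `fit_model(my_data, tolerence = 1e-6)` stops right away with a message naming the bad setting, instead of being silently ignored.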

While this has all been well and good for my sanity and keeps me busy writing code, I’ve noticed two things. First, I seem to be writing far more error-checking code than code that gets work done. It could just be that checking for correct argument names is a pretty bulky process, because type checking is really pretty easy to do in R. Second, it’s helping me too: it catches the stupid mistakes I’ve told it to look for, which is pretty cool.
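For comparison, basic type checking really can be a few lines using base R’s `stopifnot()` and the `is.*` predicates. This is only a sketch; the function `scale_values()` and its arguments are hypothetical and stand in for whatever the package functions actually take.

```r
# Hypothetical function; the name and arguments are made up for illustration.
scale_values <- function(x, factor = 1, label = "scaled") {
  # Each condition must be TRUE, otherwise stopifnot() raises an error
  # that names the first condition that failed.
  stopifnot(
    is.numeric(x),
    is.numeric(factor), length(factor) == 1,
    is.character(label), length(label) == 1
  )
  out <- list(x * factor)
  names(out) <- label
  out
}
```

Calling `scale_values("a")` fails immediately on the `is.numeric(x)` check, while `scale_values(c(1, 2), 2)` goes through. The error messages from `stopifnot()` are terse, so a friendlier package might wrap these checks in `if (...) stop(...)` with custom text.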

Now all I need to do is actually get this whole thing done so I can finally get it submitted.

