How do you review code that accompanies a research project or paper? Help rOpenSci plan a Community Call

I’m :100: percent with Hao’s comment about how refactoring makes the notion of unit tests pretty hard in research code. In software development, I often have a reasonable idea what the API should look like, so even while I still need to do a deal of refactoring of internals, it doesn’t break anything. In contrast, research code often isn’t even functions to begin with. Still, it maybe possible to think of a somewhat different testing/assertion paradigm for (rapidly evolving) research scripts. For instance, there was an interesting effort to develop a testing framework for Rmds at the 2017 unconf; https://github.com/ropenscilabs/testrmd.

Ironically, I think the more common problem I encounter in trainee code is too little refactoring. Am I alone in thinking this? I believe it’s easy to get stuck feeling “this giant block of code took me 2 weeks to write, I’ll just continue to modify and extend it”, when the more sensible thing is more often to re-write into smaller functions, not bigger ones. Refactoring a script that already “works” can seem like a waste of time and only risk breaking things. When is it time to refactor, and how do folks encourage refactoring?

6 Likes