So far my approach has been to focus more on structure / organization than on function.
I expect the code and other material associated with a trainee’s research project to live in a (public or private) GitHub repository. I encourage trainees to follow the R package / ‘compendium’ model of organization for the repo (e.g. https://github.com/cboettig/compendium), with analyses in .Rmd notebooks, functions in an R/ directory, and a README.md and a DESCRIPTION file at the top level.
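Concretely, a minimal compendium-style repository might look something like this (file and directory names are illustrative, not a prescription):

```
project/
├── DESCRIPTION
├── README.md
├── R/
│   └── functions.R
└── analysis/
    └── analysis.Rmd
```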
Usually I don’t worry about the repo including unit tests or roxygen documentation. I like folks to have a .travis.yml that runs some or all of the .Rmd notebooks (a sketch of such a config follows this paragraph), but this isn’t always realistic: in many cases the notebooks would simply take too long to run. In any event, an almost always better check would be a handful of minimal unit tests that exercise the functions against circumstances where we already know what the result should be (see the testthat sketch below), though in practice I haven’t been very successful in getting folks to write such “trivial” tests. I’d also like to turn on linting as part of the automated checks, but so far I haven’t inflicted that on anyone. (My classroom students will probably be the first guinea pigs for linting this semester.)
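For reference, the kind of .travis.yml I have in mind looks roughly like this. This is a sketch rather than a config we actually all use; the notebook path is a hypothetical placeholder, and the commented-out line shows where linting could eventually slot in:

```yaml
# Sketch of a Travis config for an R compendium; paths are illustrative
language: r
cache: packages
script:
  # render the analysis notebook(s) rather than running R CMD check
  - Rscript -e 'rmarkdown::render("analysis/analysis.Rmd")'
  # linting could be switched on here once we are ready for it:
  # - Rscript -e 'lintr::lint_package()'
```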
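By “trivial” tests I mean something like the following testthat sketch, where the function and the values are hypothetical stand-ins for whatever lives in R/: the point is simply to check the code against circumstances where the answer is known in advance.

```r
library(testthat)

# A hypothetical stand-in for a function that would live in R/
logistic_step <- function(n, r, K) {
  n + r * n * (1 - n / K)
}

# "Trivial" tests: check the function where we already know the answer
test_that("a population at carrying capacity stays at carrying capacity", {
  expect_equal(logistic_step(n = 100, r = 0.5, K = 100), 100)
})

test_that("an empty population stays empty", {
  expect_equal(logistic_step(n = 0, r = 0.5, K = 100), 0)
})
```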
Actual code review has been very ad hoc: I may comment on how code could be improved while discussing a result with a trainee or while helping debug something, but I have rarely done explicit code review unless we are developing a software package as the main objective. I do code review in my classroom, and I would like the practice to play a bigger role with my research trainees as well. It often feels hard to review code when you don’t understand what it is doing in the first place, but I’m slowly getting over that as I learn more about what robust, maintainable code should look like (e.g. code smells) and grow more confident suggesting improvements to code I don’t fully understand.
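One smell that is easy to flag even without understanding the science is the same computation copy-pasted with only a value changed. A hypothetical R example (the data and names are made up for illustration):

```r
# Hypothetical data standing in for a real field dataset
surveys <- data.frame(
  site = rep(c("A", "B", "C"), each = 4),
  abundance = c(3, 5, NA, 4, 10, 12, 11, NA, 1, 2, 2, 3)
)

# Before: the same computation copy-pasted once per site, a classic
# duplication smell that a reviewer can spot without knowing the science
mean_a <- mean(surveys$abundance[surveys$site == "A"], na.rm = TRUE)
mean_b <- mean(surveys$abundance[surveys$site == "B"], na.rm = TRUE)
mean_c <- mean(surveys$abundance[surveys$site == "C"], na.rm = TRUE)

# After: the repeated logic extracted into one well-named function
site_mean <- function(site_name, data) {
  mean(data$abundance[data$site == site_name], na.rm = TRUE)
}
site_means <- vapply(c("A", "B", "C"), site_mean, numeric(1), data = surveys)
```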