How do you review code that accompanies a research project or paper? Help rOpenSci plan a Community Call

This is something I have been thinking about and working on lately. The review guidance I present to my workshop attendees is focused on code review for the purpose of publishing computationally reproducible code that supports a paper, so perhaps a different objective from that of others in the thread. Looking forward to hearing how others do it in their labs!

My current (and always evolving) checklist is:

Organization

  • One repository
  • Separate code and data

Documentation

  • Specify run environment
  • Specify dependencies
  • Create a project README
  • Create a data dictionary

Automation

  • Create a master script that runs the full analysis (see the sketch after the checklist)
  • Use relative paths
  • Use a container technology

Dissemination

  • Specify a license
  • Publish your repository / container with a persistent identifier
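
To make the “master script” and “relative paths” items concrete, here is a minimal sketch of what I have in mind. The file names are hypothetical, and it assumes the here package is installed; here::here() builds paths relative to the project root, so the script runs unchanged on another machine.

    # run_all.R -- hypothetical master script at the repository root
    library(here)

    source(here("R", "01_clean_data.R"))    # hypothetical: read raw data, write cleaned data
    source(here("R", "02_fit_models.R"))    # hypothetical: fit models, save model objects
    source(here("R", "03_make_figures.R"))  # hypothetical: generate figures for the paper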
3 Likes

Refactoring is one of those things that I think is important for code maintainability and reusability, but isn’t really discussed or taught. Some of the obstacles to implementation that I’ve been thinking about:

(1) concern about breaking working code; this could be addressed by better training and implementation of practices around version control

(2) unclear reward for the effort; dedicated time for code review could help with this: another person’s perspective can point out where code can be improved (or you could even swap and refactor someone else’s work), and setting aside time elevates the task to something valuable that should be incorporated into regular practice.

2 Likes

@jules32

it really takes this culture of shared practices

yep - always culture/people + tech

1 Like

@noamross I really like the file tree! How do you generate it? It’s something I’ve thought about doing a few times, but I haven’t actually done it.

1 Like

I use the tree command to generate it initially, pipe it to a file, and then edit it manually. I’ve long thought of making something that makes updating/editing it easier, but haven’t had the time.
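
If you want to stay inside R, a rough, untested sketch (assuming the fs package is installed) would be something like:

    # Rough sketch: fs::dir_tree() prints a file tree; capture.output() should
    # capture that printout so it can be written to a file and edited by hand.
    library(fs)
    tree_text <- capture.output(dir_tree("path/to/project"))  # hypothetical path
    writeLines(tree_text, "file_tree.txt")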

5 Likes

A thing I notice in checklists here, as well as those we use for rOpenSci package review, is the challenge of coming up with a general approach for reviewing whether the code does what it is supposed to do. In theory, individual projects should have unit tests for this, but it is hard for a reviewer to know whether they should trust the unit tests, and hard for the author to write tests for errors they don’t anticipate. “Coverage” statistics are of limited value.

I don’t have a great solution for this, but I guess I want to make sure that any checklist has something like, “Are you confident the code implements the methods as described?” It’s a big job for the reviewer, but it’s the central point, and one that can get a bit lost among all the lists of best practices.

2 Likes

THIS very much. I would love to have code review in place in my department (~90% of us code in R, though we use various approaches). But the very first thing anyone says when I bring it up is “who has the time for that?” Everyone is understandably concerned with getting projects to collaborators, etc., and formal code review is somewhat of a foreign concept, so any review we do have is basically ad hoc, happening only when someone hits a snag. Convincing enough folks to value code review highly enough to dedicate time to it (often for someone else’s complex data-wrangling code from a large multicenter study) is a big hurdle.

5 Likes

I’m in the midst of trying to get some type of code review going in my department, so I have some thoughts on this topic. I work in a group where (almost) all of our projects are conducted completely independently, and people program in R, SAS, Stata, and possibly other languages. Additionally, most data cannot (strictly) be shared even among members of our group due to HIPAA issues.

I believe the comments about the necessity of a culture of shared values surrounding coding are spot on, and in trying to get code review up and running in my group I have encountered some challenges related to this. The most common arguments I hear are a) it will be too time consuming and b) my code isn’t good enough to share.

Instead of peer-to-peer code review, which was facing too much resistance, I recently started a “Coding Workshop”. At regular intervals, someone submits a piece of code for review and a reviewer is assigned. The code is also made available to the rest of the group in advance. Then, in the meeting, the code author briefly explains what the code is meant to do, the reviewer goes over their comments, and there is general discussion. Since no data sharing is involved and therefore the code can’t be run, the main points we have asked reviewers to focus on are (a small R illustration of points 2 and 3 follows the list):

  1. Is the code well commented?
  2. Is there repeated code that could be eliminated through use of a function/macro?
  3. Are numbers used in tables and reports generated automatically to avoid typos or copy/paste errors?
  4. Are there any functions or methods you know of that could improve efficiency?
  5. Is the code readable?
  6. Was there something you learned from reading the code that you would use in the future?
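
As a tiny, hypothetical illustration of points 2 and 3 (the file names and numbers are made up, not from a real project):

    # Point 2: repeated cleaning steps pulled into a function instead of
    # copy-pasting the same block for every site.
    clean_site <- function(path) {
      dat <- read.csv(path)
      dat <- dat[!is.na(dat$value), ]                   # drop missing measurements
      dat$value_scaled <- as.numeric(scale(dat$value))  # standardize the measurement
      dat
    }
    site_files <- c("site_a.csv", "site_b.csv", "site_c.csv")  # hypothetical files
    sites <- lapply(site_files, clean_site)

    # Point 3: numbers in reports generated from the objects themselves,
    # rather than typed or pasted in by hand.
    n_total <- sum(vapply(sites, nrow, integer(1)))
    cat(sprintf("We analyzed %d observations across %d sites.\n",
                n_total, length(sites)))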

While this process will not catch errors in code that is producing results for manuscripts, the hope is that it will a) help everyone learn better programming principles and b) make people more comfortable sharing their code so that we can ultimately implement a more rigorous code peer review process.

Clearly we are a group in very early stages of thinking about best practices and some sort of shared standards across our very independent work, but I have gotten a lot of positive feedback on Coding Workshop after the first couple of meetings, so I hope it will be successful in leading to more rigorous efforts down the line.

5 Likes

Hi there,

And thank you for the interesting discussion!

I wanted to share some steps we follow on the Scientific Computing Team at NCEAS when we work with scientists to archive their products, which often consist of a set of data inputs, scripts, and outputs (data and/or figures).

  1. We try to run the code. Yes, this sounds trivial, but it already lets us check whether we have all the necessary libraries and sourced code, such as scripts containing custom functions. Moreover, I think the most important thing this step checks is data access. To be able to run an analytical script, you need access to all the input files, which can be problematic in an external review, especially if you process large datasets that might themselves be the result of a data collation effort. It can also be problematic for internal review if you do not have a centralized way of managing your data (e.g. on a server with shared directories).
  2. Once we can run the script(s), we check that we get the same results as the output files that were provided to us (a simplified sketch of this check follows the list). If it goes well, we move on to the next step; otherwise we start a discussion with the scientists. This mainly catches version issues (in either the scripts or the data) and runtime environment differences (pretty rare in our case, as we often set up our collaborators on our analytical server).
  3. Then we start to look at the code in more detail, though I would not say we do an in-depth review, as some of the code is very specialized and uses complex models (which raises the question of how to clearly scope the code review process for reviewers). So far, for this archiving step, we have mainly focused on improving code commenting to make sure other scientists can understand what is going on in the code. We also ask our scientists to describe their workflow clearly when there are several parts/scripts to their analysis (still looking for the best tool to do so!).
  4. Finally, and this is a work in progress, we would like to help scientists modify their code so that instead of reading data from local file systems it pulls data directly from repositories. This was one motivation for starting to develop the metajam R package (https://nceas.github.io/metajam/) with @isteves and Mitchell Maier, which aims to provide simple functions to do so. We are not quite there yet, and this might be outside the scope of this discussion, although with the growing requirement to archive data with publications, it might be an interesting recommendation to facilitate the code review process.
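
As a concrete, simplified sketch of the check in step 2 (assuming the outputs are CSV files; the file names are made up):

    # Hypothetical sketch of step 2: compare the regenerated output against the
    # archived output file that was provided to us.
    archived    <- read.csv("outputs/results_archived.csv")  # file provided by the scientist
    regenerated <- read.csv("outputs/results_rerun.csv")     # file produced by our re-run
    comparison  <- all.equal(archived, regenerated)
    if (isTRUE(comparison)) {
      message("Outputs match the archived results.")
    } else {
      message("Outputs differ; time to start a discussion with the scientists.")
      print(comparison)  # show where they differ
    }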

Some other thoughts: to me, code optimization is different from code review (as asked about here). Refactoring your code to make it modular and reusable, or profiling it to make it more efficient, is often a hard sell to scientists, especially when they are “done” with their analysis. I agree that training and recommendations on how to structure your projects (I need to check out some of the suggestions in this thread!!) seem the way to go, and that these changes would be hard to achieve via code review. That being said, I think one output of the review process should be to define and set up unit tests on the scripts that have been reviewed; this would be a good way to keep a check on further developments/improvements.
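
For example (a minimal, hypothetical sketch using testthat; the function is made up for illustration), a reviewed script could end up with a few tests that pin down its key behaviour:

    library(testthat)

    # a small helper that came out of the review (hypothetical)
    summarise_counts <- function(x) {
      c(n = length(x), mean = mean(x, na.rm = TRUE))
    }

    test_that("summarise_counts handles missing values", {
      out <- summarise_counts(c(1, 2, NA))
      expect_equal(unname(out["n"]), 3)
      expect_equal(unname(out["mean"]), 1.5)
    })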

I hope this is useful!

Julien

3 Likes

I hope this is useful!

Fantastically useful. Thank you for the clear layout of your approach @brunj7.
We’ll be arranging the details of this community call soon.

A bit off topic, but I am giving a lot of thought these days to how to move towards having journal editors require the submission of code along with a paper and its data.

My experience is aligned with that of @jenniferthompson and @zabore, but much worse: my supervisor does not use R; nobody in my lab does any code review (even though most students use R); PIs only look at the results of analyses and question the statistical methods used, never the code used to produce them; there is never any refactoring done by anyone (the mere idea of it would surprise everybody I work with); even basic code formatting is all over the place, so forget about writing decent code using functional programming to replace copy-paste or crazy loops…; absolute paths, setwd(), and other forms of non-portable code cripple everybody’s scripts; nobody uses GitHub or even version control. Our culture is so far away from anything acceptable on this front that it will take years and years to get to any reasonable place. And that is why I feel that things will only start to really change when there is pressure from higher up (meaning the journals) to provide code. Until then, people don’t know, don’t care, don’t have the time, don’t have the incentive, and don’t give any thought to the subject of writing readable, portable, and reviewed code.

Reading some of the posts in this thread, I was impressed to see that in other labs things are much further ahead. But I think that my lab is, unfortunately, more representative of the classic university research lab. There is a lot to do, and things are often so bad that doing it from the ground up seems unrealistic. A top-down incentive seems to me to be the only way to shake things up. It could also be a way to impose some form of norm. But I have no idea how to work towards this goal.

2 Likes

(I am aware that my post is very naive, and I am extremely thankful and excited to read all the compendiums and other great links in this thread, and the papers that were published on the importance of code publication. But all of this feels like bottom-up grind work, and I am pessimistic about when it might reach my lab and countless others like it. That’s why I would love to hear about approaches for reaching out to journal editors or funding agencies like NSF, since those are BIG incentives for research labs and could make things change at a large scale. In Canada, the three main funding bodies (the Tri-Agency) recently made a wonderful move towards open access, and that is really making things change over here (not for code, however). But maybe a lot of this sort of bottom-up work needs to be done before big funding agencies or journals can be convinced to set policies that will then force the generalization of these better practices to a much wider community of researchers?)

But I am getting more and more off-topic. Sorry about that.

2 Likes

Save the date :spiral_calendar:!
Community Call on this topic takes place Tuesday, October 16, 2018, 9-10AM Pacific (find your timezone)

Agenda:

  • Stefanie Butland @stefanie - welcome, logistics, introduce presenters (5min)
  • Carl Boettiger @cboettig , moderator and presenter (10 min)
  • Melanie Frazier (possibly tag-teaming with Julia Stewart-Lowndes @jules32) (10 min)
  • Hao Ye @hye (10 min)
  • Q&A (20 min)

More details to come very soon

2 Likes

@prosoitos - I feel your pain, and I have also encountered labs where spending time on code review or refactoring would be scoffed at. I also agree that a big lever here is for funding agencies and journals to be involved in promoting better practices. Unfortunately, I think it needs to be more than just requirements for code sharing, because that doesn’t address standards or enforcement. My hope is that funding agencies see the need to implement both requirements and support training for entire research labs, since I imagine there are plenty of places that would be interested in improving practices, but can’t overcome the barrier of changing on their own.

(And if you want to chat more, feel free to reach out via private channels.)

2 Likes

You are completely right about the leverage that journal editors and granting agencies have. Journal policies have been a very useful tool in similar areas, such as expanding data publication requirements. In my field, ecology, the adoption of preprint and data-deposition policies by the major journals occurred largely in the past 10 years. We are slowly seeing this happen with code, too; partial policies like code upon request (example from Nature) are useful. They give reviewers the tools to request code and start to push for standards that can eventually make their way up to policy. I pretty much always make such requests if the journal has such a policy and attempt to reproduce results, and I know this provides a pretty powerful incentive for the authors! (This can also annoy the authors a great deal, so it’s important to be helpful and constructive when reviewing the results so that they appreciate the feedback.)

If you want an example of a lobbying effort, I sent this letter regarding data access and preprints to the editor-in-chief of a journal in my field about three years ago, and 80% of the recommendations were adopted. This was accompanied by some personal lobbying, which is the pattern I’ve seen with other journals: a few private and public letters plus some conversations with colleagues at a conference can go a long way. I imagine enough places have adopted minimal code-sharing policies now that they could be used as examples. Most editors are eager to emulate the policies of what are perceived as prestige or competitor journals, so when a big campaign pushes a journal like Nature to change its policies, it becomes much easier to leverage that to lobby for policies in more niche publications.

3 Likes

Wow. This is fantastic. Thank you!

Your lobbying efforts are really great and a beautiful example of how to have an impact at the individual level. This really answers a lot of my questions and is extremely inspirational. Thank you very much for sharing!

1 Like

In addition to requesting / requiring code in relevant submissions, one could also imagine journals recruiting reviewers specifically to evaluate code / reproducibility.

While it is obviously not reasonable to expect such a reviewer to exhaustively evaluate the validity of large and complicated software, there are some very basic and easy things that could be checked with minimal effort.

For example, I recently participated in a three-day reproducibility workshop at NIH that attempted to teach researchers the principles and best practices of reproducibility by reproducing ~10 bioinformatics / genomics papers which appeared to have all of the information / code necessary to be easily reproduced. Within 2 minutes of looking at the RMarkdown code for the very first paper, it was obvious that it had no hope of ever being run (referenced variables were not defined anywhere in the file).

Some things that would be easy to check:

  • Are all variables defined?
  • Is package information captured?
  • Is the code documented at some minimal level?
  • Does the file reference directories / data that are unique to some user’s system? (e.g. setwd("/home/jsmith"))

In about ten minutes of downloading a script / software and attempting to get it running, you could at least make sure it passes these minimum requirements.

3 Likes

Seems like these kinds of checks could be fairly easily bundled into a package. Does this functionality already exist (in devtools or elsewhere)?

I imagine something like

reprod_check <- check_reprod('myscript.R')

That would return lines containing undeclared variables, setwd() calls, etc.
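
A very rough sketch of what such a function might look like (hypothetical, not an existing package; it only catches a few obvious patterns with regular expressions):

    # Hypothetical sketch of check_reprod(): flag a few obvious portability
    # problems in a script. A real version would need proper parsing
    # (e.g. via getParseData()) to find undeclared variables.
    check_reprod <- function(path) {
      lines <- readLines(path, warn = FALSE)
      flags <- list(
        setwd_calls    = grep("setwd\\(", lines),
        absolute_paths = grep("[\"'](/|[A-Za-z]:\\\\)", lines),
        install_calls  = grep("install\\.packages\\(", lines)
      )
      # return the offending line numbers and their code, per category
      lapply(flags, function(idx) data.frame(line = idx, code = lines[idx]))
    }

    # check_reprod("myscript.R")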

I suppose the alternative is to just encourage researchers to bundle their code into packages to accompany publications, thereby addressing the documentation issues, calls to libraries, etc., but that might be an unrealistic ask…

1 Like

one could also imagine journals recruiting reviewers specifically to evaluate code / reproducibility

rOpenSci actually has a collaboration with Methods in Ecology and Evolution (MEE). Publications destined for MEE that include the development of a scientific R package now have the option of a joint review process whereby the R package is reviewed by rOpenSci, followed by fast-tracked review of the manuscript by MEE. Authors opting for this process will be recognized via a mark on both web and print versions of their paper.

Described here: rOpenSci | Announcing a New rOpenSci Software Review Collaboration

In this case, rOpenSci manages the package review process, so it’s not the journals recruiting reviewers, but it’s a good start.

1 Like

Details of this Tues Oct 16 Community Call, including how to join: https://ropensci.org/blog/2018/10/05/commcall-oct2018/

Pass it on!

1 Like