How should I manage transfer of an R package to a new org? Creating a legacy package?

Hello all, I have a question or two about the best ways to go about transferring an R package to a new org and how to best deal with what might be some significant user-visible changes.

I maintain the R package dplR and have done so for over a decade. It’s pretty widely used in the analysis of tree-ring data. As part of a larger effort to make several different tools available to the research community under one umbrella I’m now part of a team (OpenDendro) that will have dplR at its core. I’ll be working with some grownup programmers (not hacks like me!) and we are going to be refactoring the code and streamlining some of the functionality (e.g., make the workflow play nicely with the tidyverse). This will mean adding some new functions and deprecating some others (e.g., my terrible plotting functions which seemed like a good idea in 2007 have got to go). Over the next two years, we are also going to expand the codebase to include a Python version, some nifty Shiny apps, and much better documentation.

I know I can transfer the ownership of the repo in GitHub easily enough and change the DESCRIPTION in the next release so that the CRAN gods will be placated with a new repo URL. However there will be, I think, some changes that will affect existing users and I’m not eager to break anybody’s code.

I’d like to keep the package name the same (dplR) regardless of the changes and I’m wondering if it would be wise to make the existing package something like dplR_Legacy so that longtime users can still have an actively maintained package that works as they are expecting.

I hope that makes sense.

So, here are my questions:

  1. What is the best approach for transferring the existing package to this new organization? Just do it in GitHub under Settings/Danger Zone?

  2. Should I create a legacy package on CRAN for existing users that might not want any changes? Or is it better to do something similar, like creating a legacy branch in GitHub and not publishing it on CRAN?

Any help under the general heading of “what to do when you are thinking about making substantial changes to an existing package but want to keep users happy” would be greatly appreciated. As you can tell, I’m not up to speed on what the best practices are in software development and I’d like to do it right.

Thanks for reading this far.

-Andy


Hi @AndyBunn. rOpenSci Community Manager here. I’m sorry you had to wait so long for me to reply.

First, congratulations on what sounds like a really positive transition for the package! I’ll point you to a couple of resources and ping others who might have specific advice for you (my go-to colleagues for this are on vacation right now).

Cheers


Yes! Then make sure you are granted admin rights to the repo, if you are not an owner of the GitHub organization. Other useful things to know

Then on to your more difficult question. :slightly_smiling_face: The list of resources by @stefanie is the same I’d come up with. I’d only add a non-rOpenSci resource, the talk “Maintaining the house the tidyverse built” by Hadley Wickham.

Now, I’m not so sure I’d rename the existing package and create a new one with the same name. At least I can’t think of a similar example, but I can think of the following cases:

  • going with breaking changes (e.g. gganimate changed a ton when it changed hands, though it wasn’t on CRAN when that happened; many packages have breaking changes in their release notes);

  • allowing users to use different editions (well I can only think of testthat and I don’t think it’s a relevant solution in your case);

  • creating a new package with a new name and abandoning / maintaining the old one (e.g. dplyr and plyr).

Good luck with further development of your package! :deciduous_tree:


You could follow the Bioconductor guidelines for such changes. In summary:

  1. Release a new version that contains both the old and the new code, mark the old code as deprecated (with .Deprecated), and use the new code internally where possible.
  2. After some time (that’s up to you or your release schedule), mark the old code as defunct (with .Defunct).
  3. In the next release, or after some more time, remove the old code.
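
The first two steps above could look roughly like this in base R. This is only a minimal sketch: the function names `old_summary`, `old_summary_defunct` and `new_summary` are hypothetical and are not part of dplR, and in a real package the defunct version would replace the deprecated one under the same name in a later release.

```r
# Hypothetical function names, used only for illustration.

# New implementation that the package is moving to.
new_summary <- function(x) {
  c(mean = mean(x), sd = stats::sd(x))
}

# Stage 1 (deprecated): the old function still works, but it warns
# and delegates to the new code internally.
old_summary <- function(x) {
  .Deprecated("new_summary")
  new_summary(x)
}

# Stage 2 (defunct, in a later release): the old function now stops
# with an error that points users to the replacement.
old_summary_defunct <- function(x) {
  .Defunct("new_summary")
}
```

In the final stage the old function is simply deleted from the package and from NAMESPACE.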

The tidyverse team uses the lifecycle package and its functions deprecate_soft(), deprecate_warn() and deprecate_stop() to the same effect.
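
With lifecycle, the same progression might be sketched as below. The package name `mypkg`, the version number `"2.0.0"` and the function names are all hypothetical, and the three stages would normally be applied to one function across successive releases rather than defined side by side.

```r
# All names and version numbers here are hypothetical.
# Requires the lifecycle package to be installed.

# Placeholder for the replacement implementation.
new_plot <- function(x) {
  invisible(x)
}

# Gentlest stage: meant to signal only when the deprecated function
# is called directly by the user, not via another package.
old_plot_soft <- function(x) {
  lifecycle::deprecate_soft("2.0.0", "mypkg::old_plot()", "mypkg::new_plot()")
  new_plot(x)
}

# Next stage: warn callers, while still running the new code.
old_plot_warn <- function(x) {
  lifecycle::deprecate_warn("2.0.0", "mypkg::old_plot()", "mypkg::new_plot()")
  new_plot(x)
}

# Last stage before removal: calling the function is an error.
old_plot_stop <- function(x) {
  lifecycle::deprecate_stop("2.0.0", "mypkg::old_plot()", "mypkg::new_plot()")
}
```

Note that, by default, lifecycle throttles repeated warnings within a session, so users are not flooded with messages.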

This way you help your users move to the new (better) code, if they pay attention. While code is deprecated, it usually receives only bug fixes.
Allow some time to support the old code and to make the transition to the new code easy (and to find bugs in the new code). If the package is used a lot and there are external training materials showing the old code, the appearance of new external material using the new interface/approach would, in my opinion, be a good indication that it is time to move to the defunct stage.
Sometimes a heads-up about the coming changes is also good, to explain the reasoning behind them and to help downstream dependencies and users know what changes they might expect. Two packages depend on dplR, and current CRAN policy requires you to give reasonable notice of such changes, “at least 2 weeks, ideally more”; I’ve heard that 1 month’s notice is normal.

I think there are some examples of big refactorings, with similar questions and answers, on the Bioconductor mailing list.


As a PS, in the CRAN Repository Policy I read a potentially relevant sentence:

Introduction of packages providing back-compatibility versions of already available packages is not allowed.
