Reproducibility in R package building with Travis and packrat

I’m using packrat to store the packages that my package depends on, as you can see in a project I’m currently working on: https://github.com/benmarwick/ktc11

I’m also using Travis CI to alert me if I make breaking changes to my package. But each time Travis builds my package, it installs the packages I depend on from CRAN rather than from my local packrat directory. This means my Travis build might use different package versions from my local build, which I want to avoid.

How can I configure Travis (ideally through the .travis.yml file) to get the package sources from my local packrat directory rather than from CRAN?

It seems like this has been achieved by richfitz/wood, with this in his .travis.yml file:

env:
 USE_PACKRAT=1

and a fairly complex make/packrat.mk file which makes it all work, as well as doing other things.

Is there a simpler way? I’ve cross-posted this question at Stack Overflow (though it’s more of a devops question than a coding question) and at the Travis CI issue tracker. I’m hoping someone in the rOpenSci community might have already solved this problem.

Good question @benmarwick

@maelle @richfitz @cboettig @noamross perhaps you have some thoughts on this issue?

I haven’t run into this yet, but that sure does seem complicated.

Pinging folks in Slack to see if we can get some help from there.

After much trial and error and further reading, it seems that this will do it, with a .travis.yml file like this:

# R for travis: see documentation at https://docs.travis-ci.com/user/languages/r

language: R
sudo: false
cache: packages
install:
  - R -e "0" --args --bootstrap-packrat
warnings_are_errors: false

The key lines in the above file are:

install:
  - R -e "0" --args --bootstrap-packrat

This starts R and builds the R packages stored in the local packrat directory so that they are available on the Travis machine. The expression "0" does nothing by itself; the --bootstrap-packrat argument is what prompts packrat to build the packages.

After that, Travis continues and attempts to build the package. It does not need to contact CRAN for the dependencies because they are already available (assuming packrat is working as expected).
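To confirm that the build really is using the packrat library, one option is to print the library paths and installed versions before the check runs. The following is a minimal sketch, not something from the original setup: the before_script step is my addition, and it assumes the project's .Rprofile activates packrat so that Rscript picks up the packrat library.

```yaml
# Sketch only: the before_script block is an illustrative addition.
language: R
sudo: false
cache: packages
install:
  - R -e "0" --args --bootstrap-packrat
before_script:
  # Print the library paths R searches; with packrat active,
  # a path under packrat/lib is expected to appear first.
  - Rscript -e ".libPaths()"
  # Print name and version of every package installed across those paths,
  # for comparison with the versions recorded in packrat/packrat.lock.
  - Rscript -e "print(installed.packages()[, c('Package', 'Version')])"
warnings_are_errors: false
```

If the versions in the Travis log differ from those in packrat.lock, the bootstrap step probably did not run as intended.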

I discovered this trick here: https://travis-ci.org/ChowHub/paper-pattern-similarity/builds/127262823 and at https://github.com/rstudio/packrat/issues/158. I’ve got it working here: https://travis-ci.org/benmarwick/mjbtramp/builds/157747326

The advantage of this is that we can build on Travis with exactly the same packages that we’re using locally. Instead of getting the latest packages from CRAN on every build, we control the package versions that Travis builds with in our project.

The disadvantage is that the build time on travis is substantially increased. One of my projects went from 2-3 mins to 13-15 mins after switching to packrat.

Cross-posted from my Q&A at http://stackoverflow.com/a/39338706/1036500

Nice work @benmarwick - Looks pretty slick. Too bad about the build time increasing, but a big win for reproducibility.

Do you think there’s a chance the build time with packrat could be decreased?

Yes, after following up on Jim’s comment, it seems we can cache the packrat packages using cache: like this:

# R for travis: see documentation at https://docs.travis-ci.com/user/languages/r

language: R
sudo: false
cache:
  directories:
    - $TRAVIS_BUILD_DIR/packrat/
  packages: true
install:
  - R -e "0" --args --bootstrap-packrat
warnings_are_errors: false

In my use-case, this has reduced the times substantially, back to 1-2 mins.
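One possible refinement, offered as an untested sketch: caching the whole packrat/ directory also caches files that are tracked in the repository, such as packrat.lock, so a stale cache could in principle put back older copies of them after checkout. Assuming packrat's default layout, where built packages live in packrat/lib and downloaded sources in packrat/src, the cache can be limited to those two subdirectories:

```yaml
# Sketch only: caches the built libraries and package sources,
# leaving packrat.lock and packrat/init.R to come from the git checkout.
language: R
sudo: false
cache:
  directories:
    - $TRAVIS_BUILD_DIR/packrat/src
    - $TRAVIS_BUILD_DIR/packrat/lib
  packages: true
install:
  - R -e "0" --args --bootstrap-packrat
warnings_are_errors: false
```

I have not measured whether this changes build times compared with caching the full directory.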

awesome :rocket:

will try this out when I have a packrat project