Testing Infrastructure

rOpenSci currently recommends the testthat package for software testing, and no other package. This is perhaps overly prescriptive, and could discourage people from implementing or using entirely different, new, better, or simply other testing systems. This leads to the following question, for which we are seeking discussion here:

  • If we are to expand our recommendation to list alternative testing systems, what conditions should any potentially additional systems satisfy?

Being a scientific endeavour, we would of course like to approach an answer empirically. The following statistics provide context, summarising all identifiable testing frameworks used in CRAN packages. (All public repositories found happen to be on GitHub, with no packages using alternative hosting platforms, so the GitHub column below simply reflects the data rather than prescribing a platform.)

| Testing package | Number of packages using | GitHub | Latest CRAN update |
| --- | --- | --- | --- |
| testthat | 6751 | :heavy_check_mark: | 2021-12 |
| RUnit | 206 | :heavy_multiplication_x: | 2018-05 |
| tinytest | 151 | :heavy_check_mark: | 2021-07 |
| testit | 28 | :heavy_check_mark: | 2021-04 |
| svUnit | 12 | :heavy_check_mark: | 2021-04 |
| scriptests | 4 | :heavy_multiplication_x: | 2016-07 |
| unitizer | 3 | :heavy_check_mark: | 2022-01 |
| unittest | 2 | :heavy_check_mark: | 2019-11 |

That might suggest something like a requirement that recommended testing frameworks / packages should:

  • Be regularly maintained (= latest update no older than 1 year)
  • Have public repositories
  • Be used by at least 10 other packages
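Applied mechanically, those three criteria amount to a filter over the table above. A minimal sketch in R, with the counts and dates transcribed from the table (the cutoff date is an assumption, taken as roughly one year before this post in early 2022):

```r
# Candidate frameworks, transcribed from the table above
frameworks <- data.frame(
  package   = c("testthat", "RUnit", "tinytest", "testit",
                "svUnit", "scriptests", "unitizer", "unittest"),
  users     = c(6751, 206, 151, 28, 12, 4, 3, 2),
  on_github = c(TRUE, FALSE, TRUE, TRUE, TRUE, FALSE, TRUE, TRUE),
  updated   = as.Date(c("2021-12-01", "2018-05-01", "2021-07-01",
                        "2021-04-01", "2021-04-01", "2016-07-01",
                        "2022-01-01", "2019-11-01"))
)

# Criteria: maintained within the last year, public repository, >= 10 users
cutoff <- as.Date("2022-02-01") - 365  # assumed "today" of early 2022
keep <- frameworks$users >= 10 &
  frameworks$on_github &
  frameworks$updated >= cutoff
frameworks$package[keep]
# "testthat" "tinytest" "testit" "svUnit"
```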

That would lead to expanding our current recommendation of testthat only, to also include:

  • tinytest
  • testit
  • svUnit

But then a final note may be taken from the testit README:

There is no plan to add new features or reinvent anything in this package

So maybe that package could be justifiably excluded, leaving only tinytest and svUnit. Thoughts, please?

3 Likes

I think the wording as it stands is fine:

We recommend using testthat for writing tests. Strive to write tests as you write each new function. This serves the obvious need to have proper testing for the package, but allows you to think about various ways in which a function can fail, and to defensively code against those. More information.
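As a concrete illustration of that advice, a minimal testthat test written alongside a new function might look like the following (the function and its behaviour are invented for the example):

```r
library(testthat)

# A hypothetical new function under development
clamp <- function(x, lo, hi) pmin(pmax(x, lo), hi)

test_that("clamp() keeps values inside [lo, hi]", {
  expect_equal(clamp(5, 0, 10), 5)     # already in range
  expect_equal(clamp(-3, 0, 10), 0)    # below the range
  expect_equal(clamp(42, 0, 10), 10)   # above the range
})

# Thinking about ways the function could fail, as the guidance suggests:
test_that("clamp() works element-wise on vectors", {
  expect_equal(clamp(c(-1, 5, 99), 0, 10), c(0, 5, 10))
})
```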

That certainly would not prevent submission of a package using a different testing framework. From what I can see, tinytest is essentially a clone of the testthat features by people who hate dependencies (but like people depending on their packages :thinking:), and svUnit iterates on the ideas in RUnit, which is very much in the spirit of testthat. It’s hard to imagine that the review process, given the wording above, would raise any issues with using these packages, since they do basically the same thing.

As these packages do basically the same thing, there’s an advantage in not having people do weird things for the sake of being different (e.g., a home-rolled test framework in order to truly have zero dependencies), because that raises barriers to contribution and maintenance. So from that point of view, recommending/highlighting the one that is by far the most commonly used makes sense.

More interesting would be for people to start playing around with things like property-based testing, e.g. quickcheck or similar. For packages with special needs (e.g. shiny, HTTP requests, graphics) we already have recommendations that highlight where genuinely new features beyond standard test frameworks might be useful (shinytest, httptest{2}, vdiffr, etc.).
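For readers unfamiliar with the idea, property-based testing checks that an invariant holds across many randomly generated inputs rather than a handful of hand-picked cases. A framework-free sketch in base R (the sort() round-trip property is only an illustration):

```r
set.seed(42)  # reproducible random inputs

# Property: sorting preserves length, and sorting is idempotent
for (i in 1:100) {
  x <- sample(-1000:1000, size = sample(0:50, 1), replace = TRUE)
  s <- sort(x)
  stopifnot(
    length(s) == length(x),   # length is preserved
    identical(sort(s), s)     # sorting twice changes nothing
  )
}
```

Packages like quickcheck or hedgehog automate the input generation and, on failure, shrink the failing input down to a minimal counterexample.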

4 Likes

The only thing tinytest does differently is that tests are kept in the inst folder, so they are available to the user. For this reason I’m considering using it in my own (new) packages, so that I might be able to receive check information if the package doesn’t work on users’ computers (I hope that won’t happen, but…).
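That workflow relies on tinytest shipping its test files under inst/tinytest, so they are installed with the package. A minimal sketch, assuming tinytest is installed (the package name mypkg is a placeholder):

```r
library(tinytest)

# A test file shipped in inst/tinytest/ uses tinytest's own expectations:
expect_equal(1 + 1, 2)
expect_true(is.numeric(pi))

# A user who runs into problems can then re-run the installed tests with:
#   tinytest::test_package("mypkg")   # "mypkg" is a placeholder name
# and report the output back to the maintainer.
```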

As the author of {svUnit}, I would give a little bit of background here. {svUnit} was written at a time when only {RUnit} and the now-archived {Butler} packages existed for testing. I wanted to use {RUnit}, but its internal structure (lists of lists of lists…) did not please me. So, I rewrote it with the same API but different internal logic. Then {testthat} was written by someone who has to reinvent everything… but does it so beautifully that the majority of useRs adopt his products.
Personally, I consider it better to advise using {testthat}. But it is also important to mention that diversity is, and has always been, a strength of the R community, and that other frameworks are available too.

4 Likes

Thanks for the response @phgrosjean. Your suggestion that you consider it “better” to advise using testthat then accords with the suggestions of @richfitz to leave things largely as they are. We might simply add an additional comment that testthat is a recommendation only, and that alternative frameworks may also be used, and include a link to this discussion. The main aim of asking here was to create a public record of discussions leading to our recommendations, so thanks to all for helping to create that!

I would like to mention hedgehog, which brings property-based testing to R (and other languages).

The package is not widely used yet, but it has a lot of potential imho, which is why I wanted to mention it here. I started using it with simstudy: https://github.com/kgoldfeld/simstudy/blob/98c46fce3e983522a6eebe31d141db5dfea1d351/tests/testthat/test-define_data.R

Edit: somehow I skipped the paragraph where @richfitz mentioned quickcheck which is actually based on hedgehog :smile: