Community Call on Testing, Dec 5, 2019 - Tell us what you wish your past self knew

Join our Community Call: “Last Night, Testing Saved my Life”

Thursday, December 5, 10-11 AM PST / 6-7 PM GMT with speakers Steffi LaZerte and Rich FitzJohn.

To the uninitiated, software testing may seem variously boring, daunting or bogged down in obscure terminology. However, it has the potential to be enormously useful for people developing software at any level of expertise, and can often be put into practice with relatively little effort. We are aiming to address needs on a continuum, whether you are getting started with testing, or it’s already at the core of your development process.

What’s your story? Help us build a rich discussion by telling us below.

  • What motivated you to start testing?
  • What do you wish your past self knew?
  • What are your favorite resources on the topic?
  • What questions do you have?

See the announcement blog post for Community Call speaker bios, presentation topics and how to join the call.

What do you wish your past self knew?

It’s much easier to add tests as you go than to graft them on later! Mostly (but not only) because constantly iterating between developing code and tests actually changes the way you write the code.

Discussion question: Once you’ve got a project that’s pretty, um, “mature” and has no tests … now what? What are ways to improve the situation that have a good pain-to-payoff ratio?
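One low-pain starting point for a mature, untested project is a characterization test: record what the code currently does and lock that behaviour in before refactoring. A minimal base-R sketch (the function `summarise_cohort` is a hypothetical stand-in for any legacy function in such a project):

```r
# A characterization test freezes current behaviour of legacy code so
# refactoring can't silently change it. `summarise_cohort` is a toy
# stand-in for an untested function in the mature project.
summarise_cohort <- function(df) {
  c(n = nrow(df), mean_age = mean(df$age))
}

result <- summarise_cohort(data.frame(age = c(30, 40, 50)))

ref_file <- file.path(tempdir(), "summarise_cohort.rds")
if (!file.exists(ref_file)) {
  saveRDS(result, ref_file)   # first run: record the current behaviour
}
# later runs: compare against the recorded behaviour
stopifnot(identical(result, readRDS(ref_file)))
```

In a package you would put the reference file under `tests/` and wrap the comparison in a testthat expectation, but the idea is the same: you don't need to know the *correct* answer to start testing, only the current one.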

What do you wish your past self knew?

During grad school, I worked on a large hierarchical model where (1) the output was stochastic, and (2) the MCMC needed to run for several hours before any hope of convergence. Quick unit tests seemed insufficient to assert the correctness of the model and its implementation, so I was reluctant. But I was still running a bunch of quick checks manually: did the MCMC return the expected number of posterior samples? Were there any NAs or NaNs? Was the acceptance rate greater than 0? If I had automated these checks, development would have been much faster and smoother. If I were to implement the project all over again, I would maintain a fast built-in unit test suite and a slower (but more thorough) external validation suite.
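Those manual checks map directly onto a fast sanity-check suite. A sketch in base R (the names `check_mcmc`, `samples`, and `accept_rate` are hypothetical stand-ins for real sampler output, not from any particular MCMC package):

```r
# Automated versions of the manual MCMC sanity checks described above:
# expected number of posterior draws, no NAs/NaNs, nonzero acceptance.
# Cheap enough to run on every change, even when full convergence
# checks take hours.
check_mcmc <- function(samples, accept_rate, n_expected) {
  stopifnot(
    nrow(samples) == n_expected,  # expected number of posterior samples
    !anyNA(samples),              # no NAs or NaNs (is.na() catches both)
    accept_rate > 0               # the chain actually moved
  )
  invisible(TRUE)
}

# Toy output standing in for a real sampler run:
set.seed(1)
draws <- matrix(rnorm(2000), nrow = 1000, ncol = 2)
check_mcmc(draws, accept_rate = 0.23, n_expected = 1000)
```

A slower external validation suite (e.g. recovering known parameters from simulated data) can then live outside the package's built-in tests and run less often.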

Motivated: early on, testthat sounded like something I should do; the live-testing while working on a package got me hooked.

Past self: covering everything in detail is much less important than covering almost everything with simple tests. Tests give you a solid foundation for changing code: you have something to base changes on, since you recorded what the code did in the past. I resisted putting tests on projects that weren’t finished or were heavily in development; that was a mistake - having tests for early versions is very helpful.

Resources: sadly I haven’t read much on testthat, I just use it naively (I really should read its docs!). Books I read a long time ago and loved that are kind of related include The Pragmatic Programmer and Dreaming in Code.

Qs: so many of my tests require numeric comparisons, but I get in trouble when different toolchains return slightly different numbers - I’m not sure how to allow that kind of flexibility (expect_equivalent() is not enough).
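One common answer to the question above is to compare with an explicit tolerance rather than exactly. Base R’s `all.equal()` does this, and testthat’s `expect_equal()` forwards a `tolerance` argument to the same machinery. A small sketch (the value of `y` is just an illustrative near-miss):

```r
# Floating-point results from different toolchains rarely match exactly;
# compare with a tolerance instead of identical().
x <- 1 / 3
y <- 0.33333333   # slightly different result from another toolchain

identical(x, y)                            # FALSE: exact comparison fails
isTRUE(all.equal(x, y, tolerance = 1e-6))  # TRUE: agree to within 1e-6

# Inside a testthat test this would be:
#   expect_equal(x, y, tolerance = 1e-6)
```

Picking a tolerance is a modelling decision: it should be loose enough to absorb platform differences but tight enough to still catch real regressions.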

Here is what I wish my past (and present) self knew :slight_smile: . The tests in my package take too long, so I naturally added skip_on_cran() and skip_on_travis() to my test files. This created a problem: those tests are no longer captured by covr and codecov.io. Any ideas on how to solve this?
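One possible workaround, assuming the skips are the only obstacle: `skip_on_cran()` skips a test unless the `NOT_CRAN` environment variable is `"true"`, so setting that variable before the coverage run makes the skipped tests execute. A sketch of a local/CI coverage run (adjust to your setup; `skip_on_travis()` keys off the `TRAVIS` variable, so coverage may need to run outside the Travis test job or with that variable unset):

```r
# skip_on_cran() only skips when NOT_CRAN is not "true"; set it so the
# slow tests run during the coverage pass (sketch, not a drop-in config):
Sys.setenv(NOT_CRAN = "true")
cov <- covr::package_coverage()   # run from the package root
covr::codecov(coverage = cov)     # upload the results to codecov.io
```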

Thanks for posting that here @rafapereirabr. If you don’t get an answer sooner, you can ask during the Call. We use a Google Doc for collaborative note-taking, and we ask people to type their questions there to be asked during Q&A.

My work was mostly performed in a pharmaceutical environment, where the risk to patients from wrong calculation results had to be ruled out by testing.

I found out that testing not only makes my software safer and more reproducible, it also enables me to develop faster. Whenever I need to add features to my packages, I can immediately see if those are backwards compatible.

My past self should have found out about testthat way earlier. It is so easy to write basic tests there that already save time. Building the package RTest to get nicer test reports showed me that testing becomes even more important when test reports are required. RTest now has 98% code coverage, since you can never rule out typos/mistakes in untested lines.

My question to the others would be whether they have found any nice resource that allows really fast testing. Travis is way too slow for my purposes. I found drone to be a pretty nice solution, but it requires building your own Docker image. Any suggestions?

I am mostly concerned about data validation. A good example came up recently where I was doing some geoprocessing and incorrectly matched the processed geometries back to their features. I only caught it because I manually plotted some before-and-afters. Now the code generates some test maps to look at, so I can see if before and after don’t line up - but if I hadn’t thought to look, I wouldn’t have caught it.

So, what I would love to know more about is a structured way of thinking about testing from a data validation perspective. I use the assertthat package for some stuff, but my approach is not as principled/organized as I would like. I’ve been toying with the idea of writing tests for what I imagine my final data product to be like before I do any data processing, and then adding to them as I go, so that I have a fairly comprehensive set of tests on the final data. Maybe something like TDD but for data. Plus then you could call it “research in 3D”.
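That write-the-final-tests-first idea can be sketched with plain `stopifnot()` (or `assertthat::assert_that()`, which gives friendlier failure messages). Everything below is hypothetical: `validate_counties` and the column names just illustrate the shape such a validation suite might take for the geoprocessing example above:

```r
# Validation checks for the imagined final data product, written before
# (or alongside) the processing code - a TDD-for-data sketch.
validate_counties <- function(d) {
  stopifnot(
    all(c("fips", "area_km2", "geometry_ok") %in% names(d)),  # expected columns
    !any(duplicated(d$fips)),                                 # one row per feature
    all(d$area_km2 > 0),                                      # no degenerate geometries
    all(d$geometry_ok)                                        # processed matches original
  )
  invisible(d)
}

# Toy stand-in for the real processed dataset:
counties <- data.frame(
  fips        = c("06001", "06003"),
  area_km2    = c(1910, 1260),
  geometry_ok = c(TRUE, TRUE)
)
validate_counties(counties)
```

Because the function returns its input invisibly, it can be dropped into the middle of a pipeline as a checkpoint, and new checks can be added each time a bug (like the mismatched geometries) is discovered.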

So glad you posted here @potterzot. Loved your shoutout to testing in your (upcoming) blog post.

Thank you! I’ve got the call on my calendar so hoping to be able to make it barring something coming up.

There’s a neat blog called “Test-Driven Data Analysis”: http://www.tdda.info . The authors have built an associated Python package with data-testing tools based on their framework. I’d say the assertr package, or validate package and its associated universe (https://github.com/data-cleaning/), are the R package expressions of that philosophy.

Here are some resources on testing. What are we missing? Add your favourites.

The video, collaborative notes, and speakers’ slides from this Community Call are posted at https://ropensci.org/commcalls/2019-12-05/