S4 data for testthat and examples

heyairf · August 29, 2024, 9:30pm

Hello,

I am designing a test suite for the first time and would love some guidance for incorporating sample data for it. This is all very new to me and I haven’t had any luck finding help in person…

The functions in my package require S4 objects for inputs, so they cannot be saved as .Rdata files. More specifically, the inputs are spatial, with object classes of SpatRaster and SpatVector. For the examples in the documentation, I have written code that generates most of the example data on the fly instead of including raster and vector files in the inst/extdata directory, to keep the total package file size low. Currently each of my functions has code in the example itself that generates an example dataset (as I saw done in the terra package examples).
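To make the "generate on the fly" approach concrete, here is a minimal sketch of a small generator shared between examples and tests. The helper name `make_demo_raster()` and its dimensions are hypothetical and not taken from any package; it assumes terra and testthat are installed.

```r
library(terra)
library(testthat)

# Hypothetical helper: builds a small, reproducible in-memory raster.
# Fixing the seed keeps the values identical between test runs.
make_demo_raster <- function(nrows = 20, ncols = 20, seed = 1) {
  set.seed(seed)
  rast(nrows = nrows, ncols = ncols,
       xmin = 0, xmax = ncols, ymin = 0, ymax = nrows,
       vals = runif(nrows * ncols))
}

test_that("demo raster has the expected class and shape", {
  r <- make_demo_raster()
  expect_s4_class(r, "SpatRaster")
  expect_equal(dim(r), c(20, 20, 1))  # rows, columns, layers
})
```

In a package, a function like this could live in a tests/testthat/helper-*.R file so that every test file can call it without repeating the setup.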

Following some advice to explore how other packages address this, I was pointed to how test data is handled in the landscapemetrics package. From what I can determine (with my novice eyes), there are .R files in the pkg/data-raw directory that create an example S4 object and then call usethis::use_data (e.g. the landscape object here). There is also an associated .Rda file in pkg/data (e.g. the same landscape object). This data seems to be accessed both in the roxygen examples in the pkg/R/ .R files and within the pkg/tests/testthat/helper-testthat.R file (here) with:

landscape <- terra::rast(landscapemetrics::landscape)
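A rough sketch of that two-step layout, as I understand it, is below. To keep it runnable outside a package, `save()` into a temporary directory stands in for `usethis::use_data()`, and `load()` stands in for the package's lazy-loaded data. The raster values and the temporary path are illustrative assumptions.

```r
library(terra)

# --- what a data-raw/ script might do (run once by the developer) ---
landscape <- wrap(rast(nrows = 30, ncols = 30,
                       xmin = 0, xmax = 30, ymin = 0, ymax = 30,
                       vals = rep(1:3, length.out = 900)))
# Stand-in for data/landscape.rda as written by usethis::use_data()
rda_path <- file.path(tempdir(), "landscape.rda")
save(landscape, file = rda_path, compress = "xz")

# --- what a test helper or example might do ---
env <- new.env()
load(rda_path, envir = env)             # stand-in for pkg::landscape
landscape_rast <- rast(env$landscape)   # packed object -> SpatRaster
```

The stored object is the packed form, which is why the helper calls `rast()` on it before use.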

Is this an appropriate structure to follow for my own package example and test data? If not, I would love to learn why and also hear other suggestions!

mdsumner · August 29, 2024, 9:57pm

This is what terra::wrap() is for. I would wrap() your objects before saving them and unwrap() them after loading (rast() or vect() also work on the corresponding packed types).
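A minimal sketch of that round trip, assuming a small raster built in memory and a temporary .rds file (both are illustration choices, not part of any package):

```r
library(terra)

# Small in-memory raster with arbitrary values
r <- rast(nrows = 10, ncols = 10,
          xmin = 0, xmax = 10, ymin = 0, ymax = 10,
          vals = 1:100)

# wrap() packs the raster into a self-contained object that can be serialised
packed <- wrap(r)
rds_path <- tempfile(fileext = ".rds")
saveRDS(packed, rds_path)

# Later, e.g. in a test: read the packed object and turn it back into a SpatRaster
restored <- unwrap(readRDS(rds_path))   # rast(readRDS(rds_path)) also works
```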

Or, use writeRaster and writeVector to create file formats readable outside R, like GeoTIFF (GTiff) and GeoPackage or FlatGeobuf.
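For the file-based route, a sketch might look like the following. The temporary paths and the DEFLATE compression option are assumptions chosen for illustration; in a package the files would more likely sit under inst/extdata and be located with system.file().

```r
library(terra)

r <- rast(nrows = 10, ncols = 10,
          xmin = 0, xmax = 10, ymin = 0, ymax = 10,
          vals = 1:100)
v <- vect(cbind(x = c(2, 5, 8), y = c(3, 6, 9)),
          type = "points", crs = crs(r))

tif_path  <- tempfile(fileext = ".tif")
gpkg_path <- tempfile(fileext = ".gpkg")

# GeoTIFF with internal compression passed through to the GDAL driver
writeRaster(r, tif_path, overwrite = TRUE, gdal = "COMPRESS=DEFLATE")
# GeoPackage; the driver is inferred from the file extension
writeVector(v, gpkg_path, overwrite = TRUE)

# Reading the files back gives ordinary terra objects
r2 <- rast(tif_path)
v2 <- vect(gpkg_path)
```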

Some file formats don’t support everything supported by terra, or require extra sidecar files, and so you will also see related discussions and helpers being created for the {targets} framework in {geotargets}.

heyairf · August 29, 2024, 11:10pm

Thanks for your reply!

The extent and resolution of the required input rasters by my package mean that saving them in a readable format (e.g. a .tif file) increases the total file size of the package significantly. This is why I transitioned to generating the mock demo data with code in the examples. Could you please elaborate on where I could use terra::wrap() in the package file structure?

It would also be very helpful to my understanding to know why the approach I described in my initial post is not appropriate.

mdsumner · August 29, 2024, 11:33pm

Ah, I see. A TIFF can compress internally about as well as an .rds file does, but getting that right takes some attention to the details.

All good, I’ll have a closer look. The problem with saving a terra object is that it is not necessarily self-contained: it can consist of references to external files and pointers to live programming types that don’t get serialised in a round-trippable way.
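One way to see the "references to external files" part is to compare an in-memory raster with one opened from disk, as in this small sketch (the temporary file is an illustration only):

```r
library(terra)

# Raster whose values live in memory
r_mem <- rast(nrows = 10, ncols = 10,
              xmin = 0, xmax = 10, ymin = 0, ymax = 10,
              vals = 1:100)

# Raster that only points at a file on disk
tif_path <- tempfile(fileext = ".tif")
writeRaster(r_mem, tif_path, overwrite = TRUE)
r_file <- rast(tif_path)

inMemory(r_mem)    # values are held in memory
inMemory(r_file)   # values stay in the file until they are read
sources(r_file)    # path of the file the object depends on
```

An object like `r_file` is only usable while the file it points to still exists at that path, which is why packing it with wrap() or shipping the file itself matters.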

heyairf · August 29, 2024, 11:59pm

Looking closer at the landscapemetrics example in my initial post, terra::wrap() is used here on the raster that is being made from scratch within the same file.

Does this effectively address that problem?
