rnoaa: multiple dates from arc2()

Hi all, very pleased to find the rnoaa package for accessing NOAA data.

I’m trying to access ARC2 data. What I need is a long time series for a small area. From the rnoaa::arc2() function it seems I can only specify a single date, and it gives me the whole grid. Is there any way I can:
a) only retrieve data from a small grid box?
b) retrieve data from multiple dates in one go?

Either of these would be useful, and ideally both. Alternatively, I think I would need to loop through each day, then extract and subset the area I need, but this may get slow and/or resource-intensive.



Thanks for your question @dannyparsons and for using rnoaa

I’ve opened an issue in the repository, and will investigate

The most likely outcome is that multiple dates will be easy, but I may not do the spatial filtering, and instead leave that to the user. The arc2 function retrieves files from a NOAA FTP server, so we’d have to do the spatial filtering in rnoaa. We’ll see.

Many thanks for this, I think this would be a useful feature. I know that retrieving a long time series for a small area is much more complex than a large area for a short time period because of how these data are generally stored.

I also found that a lot of the NOAA data is available in the IRI Data Library (dataset: NOAA) and can be downloaded, already filtered, through their interface and custom links, e.g.

path <- "https://iridl.ldeo.columbia.edu/SOURCES/.NOAA/.NCEP/.CPC/.FEWS/.Africa/.DAILY/.ARC2/.daily/.est_prcp/X/(30E)/(31E)/RANGEEDGES/Y/(2S)/(1S)/RANGEEDGES/data.nc"
download.file(path, "tmp.nc", method = "curl")

Good find on the other source.

I updated the package. Restart R, reinstall with remotes::install_github("ropensci/rnoaa"), and see ?arc2

It now accepts multiple dates and a new parameter, box, to optionally filter by a bounding box. The filtering is done with sf::st_crop, which is kinda slow; if it’s too slow you can always skip box and filter yourself
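
For the "filter yourself" route, here is a minimal sketch in base R. It makes several assumptions: that the multi-date result is a list with one data frame per date, that each data frame has numeric columns named lon and lat, and that the box is ordered xmin, ymin, xmax, ymax. The helpers crop_to_box and make_day are hypothetical and not part of rnoaa, and the data is synthetic so the snippet runs offline; check ?arc2 for the real return format.

```r
# Hypothetical helper (not part of rnoaa): keep rows whose coordinates fall
# inside a bounding box given as c(xmin, ymin, xmax, ymax).
# Assumes the data frame has numeric columns named `lon` and `lat`.
crop_to_box <- function(df, box) {
  stopifnot(length(box) == 4)
  box <- unname(box)
  keep <- df$lon >= box[1] & df$lon <= box[3] &
    df$lat >= box[2] & df$lat <= box[4]
  df[keep, , drop = FALSE]
}

# Synthetic stand-in for per-date results (one data frame per date).
# With the real package this might instead be something like:
#   daily <- rnoaa::arc2(date = c("2019-05-27", "2019-05-28"))
make_day <- function(date) {
  grid <- expand.grid(lon = seq(29, 32, by = 0.5), lat = seq(-3, 0, by = 0.5))
  grid$date <- date
  grid$precip <- 0  # placeholder values, not real rainfall
  grid
}
daily <- lapply(c("2019-05-27", "2019-05-28"), make_day)

# Same small area as the IRI example: 30E-31E, 2S-1S.
box <- c(xmin = 30, ymin = -2, xmax = 31, ymax = -1)

# Crop each day, then stack the days into one long table.
series <- do.call(rbind, lapply(daily, crop_to_box, box = box))
```

Cropping each day before stacking keeps the combined table small, which matters for a long time series over a small area. The same logical condition on lon and lat could be passed to dplyr::filter instead of base subsetting.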

Changed from the sf crop function to simply using dplyr::filter, so it should be quite a bit faster now