Differences between rerddap and rnoaa and questions related to ERDDAP griddap

I just discovered rOpenSci - amazing work! Would love to get involved!

I’ve been trying to download ERDDAP data for a relatively small number of lat/long locations (fewer than 1000) but over a relatively long time span (2003-2013). So far I’ve used the xtractomatic package on GitHub, but I have been running into some issues.

My questions are:

  1. Is it possible to extract griddap values per lat/long & date (dd/mm/yyyy) using either rerddap or rnoaa rather than downloading the whole griddap?
  2. Is there a way to search all ERDDAP griddap database by specifying a region (e.g., lat/long bounding box) and temporal bounds? Something similar to the ERDDAP > Advanced Search webpage?
  3. Are there any major differences between rerddap and rnoaa with regard to ERDDAP?

Many thanks!

Hi @GodinA Thanks for getting in touch.

  • Yes, with both, but both packages might still be a bit buggy here and there with ERDDAP data, since I’m still working on making the functions more robust. For example, with rnoaa:
(res <- erddap_grid("jplCcmp35aWindMonthly",
                    time = c('2010-01-01','2011-01-01'),
                    latitude = c(59, 60),
                    longitude = c(304, 306)))

#> <NOAA ERDDAP griddap> jplCcmp35aWindMonthly
#>    Path: [~/.rnoaa/erddap/c455a41b29782f8ff88f1bc93104daf1.csv]
#>    Last updated: [2015-03-17 14:13:13]
#>    File size:    [0.05 MB]
#>    Dimensions:   [585 X 9]
#> 
#>                    time latitude longitude nobs    upstr      uwnd      vpstr      vwnd     wspd
#> 1  2010-01-01T00:00:00Z   59.125   304.125   85 30.67204 0.7263626 -104.55960 -5.807849 13.65364
#> 2  2010-01-01T00:00:00Z   59.125   304.375   86 26.64347 0.4440578  -99.24922 -5.452298 13.67652
#> 3  2010-01-01T00:00:00Z   59.125   304.625   87 23.19477 0.2487334  -96.53299 -5.269181 13.84133
#> 4  2010-01-01T00:00:00Z   59.125   304.875   91 21.85192 0.2212659  -87.52975 -4.773241 13.93174
#> 5  2010-01-01T00:00:00Z   59.125   305.125   90 23.19477 0.3158762  -87.13300 -4.732039 13.94433
#> 6  2010-01-01T00:00:00Z   59.125   305.375   89 21.69932 0.2517854  -79.77782 -4.335287 13.85735
#> 7  2010-01-01T00:00:00Z   59.125   305.625   86 21.63828 0.2182140  -87.25508 -4.655741 13.86422
#> 8  2010-01-01T00:00:00Z   59.125   305.875   83 22.30971 0.1907465  -79.71677 -4.222364 13.75320
#> 9  2010-01-01T00:00:00Z   59.125   306.125   84 24.23243 0.3418177  -78.12977 -4.191845 13.72116
#> 10 2010-01-01T00:00:00Z   59.375   304.125   86 26.09412 0.4471098 -101.75182 -5.635415 13.58726
#> ..                  ...      ...       ...  ...      ...       ...        ...       ...      ...

Is that what you had in mind?

  • No, I don’t think we support advanced searches yet. I just played around with those, and it looks like we can. I opened an issue for rerddap here: https://github.com/ropensci/rerddap/issues/7

  • rerddap is meant to be a general purpose client for working with ERDDAP servers, including tabledap and griddap data. Users can use it directly, and package developers can use it to develop another package. rnoaa is a client for many different NOAA data services (as listed in the README for rnoaa), and its functions for ERDDAP servers are only a subset of what is available in that package.

Eventually, once rerddap is more stable, I will just use that package in rnoaa to do anything with ERDDAP servers.
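For the first question (values per location and date), here is a minimal sketch in base R of picking the nearest grid cell for each point once a block like the one above has been downloaded. It does not call ERDDAP: `grid` is a toy stand-in for the returned data frame with made-up `wspd` values, and `nearest_value()` and `pts` are hypothetical names. It assumes the columns are called time, latitude and longitude, that the grid uses 0-360 longitudes as in the output above, and that the dates come in as dd/mm/yyyy.

```r
# Toy stand-in for the data frame part of a griddap result (values are made up).
grid <- expand.grid(
  time = c("2010-01-01T00:00:00Z", "2010-02-01T00:00:00Z"),
  latitude = c(59.125, 59.375),
  longitude = c(304.125, 304.375),
  stringsAsFactors = FALSE
)
grid$wspd <- as.numeric(seq_len(nrow(grid)))

# Hypothetical helper: value of `var` at the grid cell nearest to one point.
nearest_value <- function(grid, lon, lat, date, var) {
  grid_date <- as.Date(substr(grid$time, 1, 10))
  # keep only the time step closest to the requested date
  day_gap <- abs(as.numeric(grid_date - date))
  step <- grid[day_gap == min(day_gap), ]
  # squared distance in degrees, after moving lon onto the 0-360 convention
  d2 <- (step$longitude - (lon %% 360))^2 + (step$latitude - lat)^2
  step[[var]][which.min(d2)]
}

# Points with -180..180 longitudes and dd/mm/yyyy dates.
pts <- data.frame(
  lon = c(-55.8, -55.7),
  lat = c(59.1, 59.4),
  date = c("05/01/2010", "28/01/2010"),
  stringsAsFactors = FALSE
)
pts$date <- as.Date(pts$date, format = "%d/%m/%Y")

pts$wspd <- vapply(seq_len(nrow(pts)), function(i) {
  nearest_value(grid, pts$lon[i], pts$lat[i], pts$date[i], "wspd")
}, numeric(1))
pts
```

This picks the nearest cell only; it does not interpolate between cells or time steps.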

So we can debug things if problems come up - what versions of each package do you have?

Cool. There’s lots of ways to get involved: submit a package you already work on to be in the suite, or contribute to an existing package - maybe even rnoaa or rerddap :smile:

One more thing: right now, we download data from ERDDAP servers in .csv format. I realize that this is not ideal, and perhaps .ncdf would be better (better compression, faster download times), but a separate package dependency is needed to parse the ncdf binary file format. BUT, I am playing with that now, so hopefully it will be in rnoaa and rerddap soon.

Thank you @sckott! This clarifies things. Will play with these a bit more. Will let you know how it goes!

@GodinA Great, looking forward to your feedback.

Hi @sckott,

So I managed to get my data! However, I ended up using xtractomatic from Roy Mendelssohn (on GitHub). We had a good discussion, and here are my thoughts:

  1. They implemented a function called searchData, which searches all datasets available in xtractomatic by keyword. For example, searchData(searchList = list(list("varname","upwelling"))) gives you a list of datasets that contain ‘upwelling’ in their varname. However, when you are not familiar with these datasets, you end up looking up all of them (right now there are fewer than 155, so it does not take too long). Something similar in rerddap and rnoaa would be useful (especially if all ERDDAP datasets can be accessed).

  2. I tried to download ERDDAP data using rerddap, but because I needed to extract lots of spatial/temporal information, this was taking way too long with the current csv files (but I know that’s on your radar).

Would love to get involved! I think these tools are so essential! However, I have never worked on developing R packages before… Any tips on where/how I should start?

Thanks!

Hi,
Sounds good. Can you share an actual search query you ran with xtractomatic, though? Or is it this one: searchData(searchList = list(list("varname","upwelling")))? ed_search() in rerddap does search, but I guess it doesn’t use the advanced search features. Those should be in there soon.

Glad to know you need more speed! Good motivation to work on the netcdf stuff.
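In the meantime, one rough way to cut down on the number of downloads is to request one block per month that covers all the points in that month, instead of one request per point. A minimal sketch in base R, assuming the points sit in a data frame with hypothetical columns lon, lat and date (the values here are made up):

```r
# Made-up points; lon is on the 0-360 convention used by the example dataset.
pts <- data.frame(
  lon = c(304.2, 305.6, 304.9),
  lat = c(59.1, 59.8, 59.4),
  date = as.Date(c("2010-01-05", "2010-01-20", "2010-02-03"))
)

# One bounding box and one time window per calendar month of observations.
pts$month <- format(pts$date, "%Y-%m")
requests <- lapply(split(pts, pts$month), function(p) {
  list(
    time = format(range(p$date)),
    latitude = range(p$lat),
    longitude = range(p$lon)
  )
})
str(requests)

# Each element holds time/latitude/longitude ranges of the kind passed to
# erddap_grid() above. Not run here, since it needs a network connection.
```

Whether this is actually faster depends on how spread out the points are: a few wide boxes can end up larger than many small requests.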

In terms of getting involved, the book http://r-pkgs.had.co.nz/ is a great place to go. And @hilary has a popular post on building pkgs http://hilaryparker.com/2014/04/29/writing-an-r-package-from-scratch/

Oh great, I haven’t looked in much detail at the ed_search() function. Yes, searchData(searchList = list(list("varname","upwelling"))) is one way of searching, but you can also specify further keywords by adding lists. It is not the most straightforward way of searching (I think they are working on it though). It would be fun to be able to search with logical operators and to specify the date/time frame and region.
Thanks for the tips on getting started with building pkgs!

Will work on incorporating that.

Hi again @GodinA

I added advanced search. try:

devtools::install_github("ropensci/rerddap")
library("rerddap")
ed_search_adv(variableName = 'upwelling')
10 results, showing first 20 
Source: local data frame [10 x 2]

                                                               title       dataset_id
1    Wind Stress, METOP ASCAT, Global, Near Real Time (1 Day Composite)  erdQAstress1day
2   Wind Stress, METOP ASCAT, Global, Near Real Time (14 Day Composite) erdQAstress14day
3    Wind Stress, METOP ASCAT, Global, Near Real Time (3 Day Composite)  erdQAstress3day
4    Wind Stress, METOP ASCAT, Global, Near Real Time (8 Day Composite)  erdQAstress8day
5  Wind Stress, METOP ASCAT, Global, Near Real Time (Monthly Composite)  erdQAstressmday
6      Wind Stress, QuikSCAT, Global, Science Quality (1 Day Composite)  erdQSstress1day
7     Wind Stress, QuikSCAT, Global, Science Quality (14 Day Composite) erdQSstress14day
8      Wind Stress, QuikSCAT, Global, Science Quality (3 Day Composite)  erdQSstress3day
9      Wind Stress, QuikSCAT, Global, Science Quality (8 Day Composite)  erdQSstress8day
10   Wind Stress, QuikSCAT, Global, Science Quality (Monthly Composite)  erdQSstressmday

Try it out and let me know if it’s working for you
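For the region and time part of the original question, the ERDDAP Advanced Search page takes its bounds as URL query parameters. Here is a rough sketch of building such a URL by hand in base R, assuming the parameter names the web form uses (searchFor, minLon, maxLon, minLat, maxLat, minTime, maxTime). `build_adv_search_url()` is a hypothetical helper, not part of rerddap, and the server address is a placeholder.

```r
# Hypothetical helper: build an ERDDAP advanced-search URL from a keyword,
# an optional bounding box and an optional time window.
build_adv_search_url <- function(base_url, search_for = "", bbox = NULL, time = NULL) {
  params <- list(page = 1L, itemsPerPage = 1000L, searchFor = search_for)
  if (!is.null(bbox)) {
    params$minLon <- bbox[["min_lon"]]
    params$maxLon <- bbox[["max_lon"]]
    params$minLat <- bbox[["min_lat"]]
    params$maxLat <- bbox[["max_lat"]]
  }
  if (!is.null(time)) {
    params$minTime <- time[1]
    params$maxTime <- time[2]
  }
  # percent-encode each value, then join as name=value pairs
  values <- vapply(params, function(p) {
    utils::URLencode(as.character(p), reserved = TRUE)
  }, character(1))
  query <- paste0(names(params), "=", values, collapse = "&")
  paste0(sub("/+$", "", base_url), "/search/advanced.csv?", query)
}

url <- build_adv_search_url(
  "https://example.org/erddap/",  # placeholder, not a real ERDDAP server
  search_for = "upwelling",
  bbox = c(min_lon = -130, max_lon = -110, min_lat = 30, max_lat = 50),
  time = c("2003-01-01", "2013-12-31")
)
url
```

The resulting URL could then be read with read.csv() against a real server.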

Great! That works well @sckott Thank you!

Hi, I am trying to run your example:

library(rnoaa)
library(ncdf4)

options(noaakey = "mypass")

(res <- erddap_grid("jplCcmp35aWindMonthly",
                    time = c('2010-01-01','2011-01-01'),
                    latitude = c(59, 60),
                    longitude = c(304, 306)))

But I obtain only:

Error in erddap_grid("jplCcmp35aWindMonthly", time = c("2010-01-01", "2011-01-01"),  : 
  unused arguments ("jplCcmp35aWindMonthly", time = c("2010-01-01", "2011-01-01"), latitude = c(59, 60), longitude = c(304, 306))

How can I fix it? thanks in advance!

Thanks for the question @milomilo

The ERDDAP functionality in rnoaa was moved into rerddap. Try

install.packages("rerddap") # if not installed already
library("rerddap")
(res <- rerddap::griddap("jplCcmp35aWindMonthly",
                    time = c('2010-01-01','2011-01-01'),
                    latitude = c(59, 60),
                    longitude = c(304, 306)))

#> <ERDDAP griddap> jplCcmp35aWindMonthly
#>    Path: [~/.rerddap/a6de8c4278732cdb48dec1e35651a312.nc]
#>    Last updated: [2016-02-12 08:41:49]
#>    File size:    [0.02 mb]
#>    Dimensions (dims/vars):   [3 X 6]
#>    Dim names: time, latitude, longitude
#>    Variable names: number of observations, u-component of pseudostress at 10 meters, u-wind at 10 meters, v-component of pseudostress at 10 meters, v-wind at 10 meters, wind speed at 10 meters
#>    data.frame (rows/columns):   [585 X 9]
#>                    time    lat     lon nobs    upstr      uwnd      vpstr      vwnd     wspd
#> 1  2010-01-01T00:00:00Z 59.125 304.125   85 30.67204 0.7263626 -104.55960 -5.807849 13.65364
#> 2  2010-01-01T00:00:00Z 59.125 304.375   86 26.64347 0.4440579  -99.24922 -5.452298 13.67653
#> 3  2010-01-01T00:00:00Z 59.125 304.625   87 23.19477 0.2487334  -96.53299 -5.269181 13.84133
#> 4  2010-01-01T00:00:00Z 59.125 304.875   91 21.85192 0.2212659  -87.52975 -4.773241 13.93174
#> 5  2010-01-01T00:00:00Z 59.125 305.125   90 23.19477 0.3158762  -87.13300 -4.732039 13.94433
#> 6  2010-01-01T00:00:00Z 59.125 305.375   89 21.69932 0.2517854  -79.77782 -4.335287 13.85735
#> 7  2010-01-01T00:00:00Z 59.125 305.625   86 21.63828 0.2182140  -87.25508 -4.655741 13.86422
#> 8  2010-01-01T00:00:00Z 59.125 305.875   83 22.30971 0.1907465  -79.71677 -4.222364 13.75321
#> 9  2010-01-01T00:00:00Z 59.125 306.125   84 24.23244 0.3418177  -78.12977 -4.191845 13.72116
#> 10 2010-01-01T00:00:00Z 59.375 304.125   86 26.09412 0.4471098 -101.75182 -5.635415 13.58726
#> ..                  ...    ...     ...  ...      ...       ...        ...       ...      ...

Hi sckott, I was away on a university field mission, so I only checked now and… it is working!
Thanks a lot!

By the way, a new question: I was using xtractomatic before, and it was quite good because it extracts the wind at the animal’s position without my having to compute it myself.
The problem is that the datasets in xtractomatic are quite bad.
Is there a function in rerddap that does the same thing (interpolating spatially and temporally in the grid to the animal’s position)?
Thank you in advance.

Can you show an example? I thought griddap did do the interpolation. Does it not?

Here is a minimal sketch of my previous code.
As you can see, I am associating the wind with each animal position.

fileslist <- list('Berlenga.csv')

library(xtractomatic)
for (files in fileslist) {
  print(files)
  (data <- read.csv(files, header = TRUE))

  Listaring <- unique(data$RING)
  Listaviaggio <- unique(data$VIAGGIO)

  vlist <- list()
  for (i in seq_along(Listaring)) {
    for (j in seq_along(Listaviaggio)) {
      # rows for this RING/VIAGGIO combination
      quali <- which(data$RING == Listaring[i] & data$VIAGGIO == Listaviaggio[j])
      # only process combinations with more than 200 positions
      if (length(quali) > 200) {
        print(length(quali))
        Eposlist <- data$E[quali]
        Nposlist <- data$N[quali]

        # ndates2 (the dates for these positions) is not defined in this sketch
        ventolist <- xtracto(Eposlist, Nposlist, ndates2, "erdQAxwind1day", xlen = .2, ylen = .2)
        xventomlist <- ventolist$mean
        xventostdevlist <- ventolist$stdev
        nlist <- list(ring = Listaring[i], viaggio = Listaviaggio[j], Epos = Eposlist,
                      Npos = Nposlist, xventom = xventomlist, xventostd = xventostdevlist)
        vlist[[length(vlist) + 1]] <- nlist
      }
    }
  }
}

(Hope you don’t mind, but I edited the code a bit to make it easier to read :wink:)

Can you share that csv file, maybe by email? I won’t share it with anyone. Or a dummy csv file?

Hi sckott, I sent a minimal sample of my files in the previous email.
The real trajectories are much longer!
Please do not publish it!

I would also like some advice.
I want to assign a weight to my wind data, based for example on temporal and
spatial variability.
These values are not present in the dataset I am using.
Do you think there is a smart way of doing it with rerddap?
Or should I download everything and perform the analysis manually?

Really, thanks a lot for your help.
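On the spatial variability question, one manual option is to download a block and then summarise the cells in a box around each position, much like the mean and stdev that xtracto returns for xlen/ylen. A minimal sketch in base R under stated assumptions: `grid` is a toy stand-in for a downloaded data frame with made-up `wspd` values and a single time step, the column names latitude/longitude are assumed, and `box_summary()` is a hypothetical helper.

```r
# Toy 3 x 3 grid for one time step (values are made up).
grid <- expand.grid(
  time = "2010-01-01T00:00:00Z",
  latitude = c(59.125, 59.375, 59.625),
  longitude = c(304.125, 304.375, 304.625),
  stringsAsFactors = FALSE
)
grid$wspd <- c(10, 12, 14, 11, 13, 15, 12, 14, 16)

# Hypothetical helper: mean, standard deviation and cell count of `var`
# inside a box of width xlen and height ylen (degrees) centred on one position.
box_summary <- function(grid, lon, lat, xlen, ylen, var) {
  inside <- abs(grid$longitude - lon) <= xlen / 2 &
    abs(grid$latitude - lat) <= ylen / 2
  vals <- grid[[var]][inside]
  c(mean = mean(vals), stdev = sd(vals), n = length(vals))
}

res <- box_summary(grid, lon = 304.25, lat = 59.25, xlen = 0.5, ylen = 0.5, var = "wspd")
res
```

The stdev (or the cell count) could then serve as a weight. With several time steps, the same idea would apply after first subsetting `grid` to the time window around each position.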