occ_search in package rgbif

The function occ_search in package rgbif provides a way to restrict the result fields it returns, using the parameter “fields = c(vector of field names to be returned)”. The problem with this is that, in the first step, all result fields from the GBIF API (like "https://api.gbif.org/v1/occurrence/search/?scientificName=Adonis%20vernalis&country=DE") are returned to the R instance. This means:

  • a big load of data is transported over the internet from the gbif api
  • this takes quite some time (up to a few minutes)
  • most of the data is then discarded, before the remaining fields of the fields-parameter are made available within the R environment.

In the GitHub issue “occ_search - getting more fields returned than just the default, but not all data” (Issue #64, ropensci/rgbif), sckott wrote:

Unfortunately, GBIF doesn’t allow you to select the data fields you want when requesting data. The minimal parameter actually works after the data is returned into your R session. If you set minimal=FALSE then nothing is done and you get all data back, but if minimal=TRUE only a few fields are returned. This does help limit filling up your R memory with minimal=TRUE, as other data fields are garbage collected, but the API call to GBIF would return faster I think if they would allow us to request specific fields. So I’ve contacted them and asked for this feature. We’ll wait and see what they say.

That was in 2014.
I have studied the API docs for the GBIF occurrence API as well as the docs for the rgbif package in R and have not found any hint whether the GBIF API has been improved in this respect in the last 6 years.
Has any of the package developers or users of rgbif found, in the meantime, a more economical way of using the GBIF API?
Best regards
Rudi

You’re correct that we filter the fields after data is returned. Unfortunately, this issue is out of our control on the R side of things. Only GBIF can control whether the user can select certain fields. I haven’t heard anything from GBIF about whether they’ll allow selecting certain fields. Perhaps you can ask in the GBIF discussion forum https://discourse.gbif.org/

by the way occ_data() will be slightly faster because, although the same amount of data has to be pulled down in the request, occ_data() throws away all but the occurrence records
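
To make the comparison concrete, here is a minimal sketch that times both calls for the query from the first post. The helper name compare_calls and the selection of fields in wanted are assumptions made for illustration. As far as I know, occ_data() has no fields argument, so the columns are dropped by hand afterwards. Nothing is downloaded until the function is actually called, and the timings will vary with network conditions.

```r
# Sketch only: compares wall-clock time of occ_search() and occ_data().
# Needs the rgbif package and network access when compare_calls() is run.

# Fields to keep (illustrative choice of common GBIF occurrence fields).
wanted <- c("key", "scientificName", "decimalLatitude", "decimalLongitude")

# Hypothetical helper, not part of rgbif.
compare_calls <- function(limit = 100) {
  # occ_search(): full records are downloaded, then reduced to `wanted` in R
  t_search <- system.time(
    res_search <- rgbif::occ_search(
      scientificName = "Adonis vernalis", country = "DE",
      fields = wanted, limit = limit
    )
  )
  # occ_data(): same download, but only the occurrence records are kept
  t_data <- system.time(
    res_data <- rgbif::occ_data(
      scientificName = "Adonis vernalis", country = "DE", limit = limit
    )
  )
  # Drop unwanted columns by hand; skip any field missing from the result
  keep <- intersect(wanted, names(res_data$data))
  slim <- res_data$data[, keep, drop = FALSE]
  list(
    elapsed = c(occ_search = t_search[["elapsed"]],
                occ_data   = t_data[["elapsed"]]),
    search = res_search$data,
    data = slim
  )
}
```

Calling compare_calls(limit = 50)$elapsed returns the two elapsed times in seconds. In both cases the full records cross the network, so any difference comes from the processing done in R after the download.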

Hi Scott,

thanks a lot for your reply, I had feared it would turn out this way.
As you suggested, I will post this problem to the GBIF forum. I really wonder whether no one until now has needed to obtain a restricted set of occurrence data “in real time” in an online application.
One common use case I have in mind is obtaining occurrence records in an online mapping app within a certain radius around a given location. You would not expect to receive a huge amount of unnecessary data that consumes time and network traffic.
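
The radius use case can be approximated with the geometry parameter of occ_search() / occ_data(), which accepts a WKT polygon. The following is a minimal sketch, assuming a bounding box around the point is good enough; the helper bbox_wkt is hypothetical and not part of rgbif. It uses a rough value of 111.32 km per degree and ignores the poles and the antimeridian. It only narrows the set of records, not the set of fields returned per record.

```r
# Hypothetical helper: bounding box (as WKT) around a point, in lon/lat order.
# Vertices are listed counter-clockwise.
bbox_wkt <- function(lat, lon, radius_km) {
  km_per_deg <- 111.32                       # rough length of one degree
  dlat <- radius_km / km_per_deg
  dlon <- radius_km / (km_per_deg * cos(lat * pi / 180))
  xmin <- lon - dlon; xmax <- lon + dlon
  ymin <- lat - dlat; ymax <- lat + dlat
  sprintf(
    "POLYGON((%.5f %.5f,%.5f %.5f,%.5f %.5f,%.5f %.5f,%.5f %.5f))",
    xmin, ymin, xmax, ymin, xmax, ymax, xmin, ymax, xmin, ymin
  )
}

# Usage (needs rgbif and network access, so it is left commented out):
# res <- rgbif::occ_data(scientificName = "Adonis vernalis",
#                        geometry = bbox_wkt(50.7, 7.1, 5), limit = 50)
```

The box also covers the corners outside the circle, so records would still have to be filtered by true distance in R if an exact radius matters.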

I will post whatever response I get in this thread.

if you’re doing mapping, maybe you can use the GBIF maps API

I’ve been reading the docs on the GBIF maps API. This is for sure the fastest way to get a visualization of occurrence records as map tiles.
But it seems to have a severe limitation: I have found no way to create popups with additional information about each single observation when clicking on a point or bin in the map.
I have put the current version of my app code on GitHub and invited you as “collaborator” to give you access to it (see: https://github.com/Rudolf-May/shiny-floramap )
I would very much appreciate it if you could take a quick look at the code to get an idea of what I am trying to achieve (for a start…)

Best regards, Rudolf

@Rudi sorry, i didn’t get a chance until now to look and looks like the invitation to that repo expired

no problem, I have launched a new invitation.
I would be happy if you would comment on my code, since I am no professional software engineer…

Nice shiny app. Maybe include in the readme what pkg dependencies are needed, and how to run the code if you’re not in rstudio, e.g., i used shiny::runApp("app.R")

  • as i said above, you might try using rgbif::occ_data() fxn, might be faster
  • the popups for each point look nice
  • might try following the Shiny SuperZip example to have more map space so the user can see more, with the sidebars hovering over the map
  • a note on the GBIF maps API: the data they return isn’t individual points, so you can’t do a popup for individual occurrences, BUT if you want faster visualizations w/o individual occurrence points, the maps API is the way to go
  • I know next to nothing about Shiny, so I can’t comment on that code
  • i’ve never used it, but it’s always best if you can test your code - perhaps try out the shinytest package on CRAN