rgbif: check for occurrence downloads already created in your GBIF account?

GBIF users: I've been pondering how to make occurrence downloads easier. One idea is to check whether a download request you make already exists in your account. If it does, we can just download that one: GBIF doesn't have to waste resources creating it again, and you don't have to wait as long to get your data.

Discussion in rgbif repo at

You can install the development version with:

remotes::install_github("ropensci/rgbif@queue-check-already-done")

An example:

# a predicate
pred_gte("elevation", 12000L)
#> <<gbif download - predicate>>
#>   > type: greaterThanOrEquals, key: ELEVATION, value(s): 12000

A normal download request would kick off a new download:

occ_download(pred_gte("elevation", 12000L))

The new function occ_download_cached() is not allowed to create a new download request. It has the same user interface as occ_download(), but instead of submitting a request it matches the query predicates against the downloads in your user account and, if it finds a match, returns it. Currently it returns the most recent match if there is more than one, but I'm thinking of having it return all matches and letting the user decide which to use.

occ_download_cached(pred_gte("elevation", 12000L))
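To make the matching idea concrete, here's a rough sketch in plain R. It is not the actual rgbif implementation: `past_downloads`, `normalize_pred()`, and `find_cached()` are hypothetical names, and the record structure (a key, a creation date, and the predicate that was requested) is an assumption made for illustration only.

```r
# Hypothetical records of downloads already in an account (made-up keys/dates).
past_downloads <- list(
  list(key = "key-A", created = "2019-10-01",
       predicate = list(type = "greaterThanOrEquals", key = "ELEVATION", value = "12000")),
  list(key = "key-B", created = "2019-11-15",
       predicate = list(type = "greaterThanOrEquals", key = "ELEVATION", value = "12000")),
  list(key = "key-C", created = "2019-12-01",
       predicate = list(type = "equals", key = "TAXON_KEY", value = "2480498"))
)

# Put a predicate into a canonical form (all values as character, fields
# sorted by name) so that equivalent requests compare as identical.
normalize_pred <- function(pred) {
  pred <- lapply(pred, as.character)
  pred[order(names(pred))]
}

# Return the key of the most recent download whose predicate matches `pred`,
# or NULL if nothing matches. Never creates a new download.
find_cached <- function(pred, downloads) {
  target <- normalize_pred(pred)
  is_match <- vapply(
    downloads,
    function(d) identical(normalize_pred(d$predicate), target),
    logical(1)
  )
  matches <- downloads[is_match]
  if (length(matches) == 0) return(NULL)
  created <- as.Date(vapply(matches, function(d) d$created, character(1)))
  matches[[which.max(as.numeric(created))]]$key
}

request <- list(type = "greaterThanOrEquals", key = "ELEVATION", value = 12000L)
find_cached(request, past_downloads)
#> [1] "key-B"
```

Returning all matches instead of the most recent one would just mean returning the keys of every element in `matches` rather than picking one by date.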

Importantly, occ_download() and occ_download_cached() return the same thing: the download key, which you can then pass to occ_download_get().

Note: I attempted to fold the check for existing downloads into the existing functions occ_download() and occ_download_queue(), but I thought the added complexity was too much: more maintenance pain, more errors, etc.


Any thoughts? Would you use this? If so, anything you’d like changed?
