Records of unexpected species when requesting data within a loop

If I run the following code for a relatively long list of species names (of lizards), it ends up retrieving records of many different unrequested organisms (plants, insects, mammals…). Any hint on how to avoid that?

# create a receiving matrix
records=data.frame(matrix(nrow=0,ncol=3))
write.table(records, file =paste(fname,a,".csv",sep=""), sep = ",",
            row.names = FALSE, col.names = TRUE, fileEncoding = "", append=F)

Nlist=read.csv… Find file with species list below:

#loop 
  for(i in 1:length(Nlist)) {
      key <- name_backbone(name=Nlist[i], kingdom='animals')$speciesKey
      r=occ_data(taxonKey=key, hasCoordinate=TRUE, 
                 limit=limit,basisOfRecord="PRESERVED_SPECIMEN")
      FF <- as.matrix(r$data[,fields[cols]])
      
        write.table(FF, file = paste(fname,a,".csv",sep=""), sep = ",",
                  row.names = FALSE, col.names = F, fileEncoding = "", append=T)
    }# end for i

I tried out your code.

Can you give a few examples of names in your list that end up with GBIF records with names that are not what you expect? Then I can track down why those names might be giving you the wrong data.

The strange thing is that I don’t see this behavior when asking for single
species, only when I ask for them in a loop. The same happened with two
different loops.

Okay, if it only happens in a loop, that suggests something is going wrong during one or more of the loop steps. Can you give some example output from the loop, together with the input name that produced it? Then I can go from there.

Sure, here it is:

The error starts after Diplodactylus conspicillatus, but I tried that
species alone and it worked fine. Also, several of the species in the list
didn’t get any records, but that might be due to slight name errors. I’m trying
to figure that out now.

Thanks!

Try this code out for debugging the problems, a little easier to see what’s going wrong:

options(stringsAsFactors = FALSE)
file <- "foobar.csv"
data_limit <- 10
num_taxa <- 100

records <- data.frame(matrix(nrow=0,ncol=4))
write.table(records, file = file, sep = ",",
            row.names = FALSE, col.names = TRUE, fileEncoding = "", append=F)

dat <- read.csv("tmaxrecords.csv")
(Nlist <- unique(dat$species)[1:num_taxa])

for(i in 1:length(Nlist)) {
  cat(i, sep = "\n")
  key <- name_backbone(name=Nlist[i], kingdom='animals')$speciesKey
  r <- occ_data(taxonKey=key, hasCoordinate=TRUE, 
             limit=data_limit, basisOfRecord="PRESERVED_SPECIMEN")
  if (!inherits(r$data, "data.frame")) {
    cat(paste0(Nlist[i], " - no records found"), sep = "\n")
  } else {
    df <- r$data[, c('name','decimalLatitude','decimalLongitude')]
    df$searched <- Nlist[i]
    FF <- as.matrix(df)
    
    write.table(FF, file = file, sep = ",",
                row.names = FALSE, col.names = FALSE, fileEncoding = "", append=TRUE)
  }
}

## see which taxa have matching name results
out <- read.csv(file) # read in results
outs <- split(out, out$X4) # split by searched taxon name
vapply(outs, function(z) identical(z$X1, z$X4), logical(1)) # which identical, which not
all(vapply(outs, function(z) identical(z$X1, z$X4), logical(1))) # all identical?

The main problem is likely:

  • Some of the taxa that don’t have matching names are actually plant taxa. For example, Lonicera japonica is a plant; see http://api.gbif.org/v1/species/5334240 for the GBIF record for that taxon. Another example, Elatostema rugosum, is also a plant: http://api.gbif.org/v1/species/4100369. In your search with name_backbone() you specify kingdom="animals", so that doesn’t work when they are plants. Try not using the kingdom parameter and see what you get.
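
One way to see which lookups went wrong is to compare the kingdom reported for each backbone match with the one you expect. This is only a minimal sketch: it assumes the match result carries a kingdom element, and kingdom_matches is a hypothetical helper, not part of rgbif. The two stand-in results are hand-made so the sketch runs without a network call.

```r
# Hypothetical helper (not part of rgbif): TRUE only when a backbone match
# reports the expected kingdom. `match` is assumed to be a list or one-row
# data frame such as the value returned by name_backbone().
kingdom_matches <- function(match, expected = "Animalia") {
  kingdom <- match[["kingdom"]]
  !is.null(kingdom) && length(kingdom) == 1 &&
    identical(as.character(kingdom), expected)
}

# Stand-in results, written by hand for illustration only
plant  <- list(speciesKey = 5334240, kingdom = "Plantae")
lizard <- list(speciesKey = 1, kingdom = "Animalia")  # made-up key

kingdom_matches(plant)   # FALSE
kingdom_matches(lizard)  # TRUE
```

In the loop, the same check could be applied to the full name_backbone() result before taking its speciesKey, so that any name resolving outside the expected kingdom is printed instead of queried.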

Thanks a lot Sckott,

However, my problem is precisely that I am only requesting names of reptiles and instead get all those plants and animals of other groups. This is why I use kingdom="animals".

Okay, so which names from your listTMAX.csv file give problems? I could try to work out which, but it’d be faster if you tell me.

It seems that this behavior only appears when I ask for several species one
after the other. It might be related to cases in which the key for the
species is NULL.

For example:

key <- name_backbone(name="bachia didactyla", kingdom='animals')$speciesKey
key
NULL
r=occ_data(taxonKey=key, hasCoordinate=TRUE, limit=limit,
           basisOfRecord="PRESERVED_SPECIMEN")

This takes a while to download and gives a lot of strange data.

Thanks, I’ll have a look.
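
If the NULL key is the cause, that would explain the strange data: in R a NULL argument is typically dropped, so the request probably goes out with no taxon filter at all and returns whatever preserved specimens have coordinates. A minimal sketch of a guard, assuming that is what happens; valid_key is a hypothetical helper, and the loop lines are shown as comments because they need rgbif and a network connection.

```r
# Hypothetical helper: TRUE only for a single, non-missing taxon key.
valid_key <- function(key) {
  !is.null(key) && length(key) == 1 && !is.na(key)
}

valid_key(NULL)      # FALSE: no species match was found
valid_key(2450000)   # TRUE: made-up key, for illustration only

# Inside the loop, skip the occurrence request when there is no key:
# key <- name_backbone(name = Nlist[i], kingdom = 'animals')$speciesKey
# if (!valid_key(key)) {
#   cat(paste0(Nlist[i], " - no species key, skipped"), sep = "\n")
#   next
# }
# r <- occ_data(taxonKey = key, hasCoordinate = TRUE,
#               limit = data_limit, basisOfRecord = "PRESERVED_SPECIMEN")
```

Skipping rather than querying keeps unmatched names out of the output file, and the printed message gives a list of names to check for spelling problems.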