Hello,
I’m trying to retrieve a list of sequences for a list of IDs with Entrez (using entrez_search from the rentrez package in R).
My input file looks like this:
query
1 CBS 119047 AND Botryosphaeriaceae[ORGN]
2 CBS 119048 AND Botryosphaeriaceae[ORGN]
3 CBS 119935 AND Botryosphaeriaceae[ORGN]
4 CBS 113190 AND Botryosphaeriaceae[ORGN]
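And my code is this:
library(rentrez)

id <- list()
for (i in 1:nrow(dataNCBI)) {
  # Search the nucleotide database for the i-th query (at most 40 hits)
  test2 <- entrez_search(db = "nucleotide", term = dataNCBI$query[i], retmax = 40)
  # Pair the query with each returned ID
  ids5 <- data.frame(dataNCBI$query[i], test2$ids)
  id[[i]] <- ids5
  # Combine everything collected so far into one data frame
  big_data2 <- do.call(rbind, id)
}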
And my output looks like this:
dataNCBI.query.i. test2.ids
97 CBS 119048 AND Botryosphaeriaceae[ORGN] 14279559
98 CBS 119048 AND Botryosphaeriaceae[ORGN] 14279558
So far so good. However, my code only works if every query returns a result. When it reaches the first query without a result, it stops. I would like the loop to handle this case and produce something like this:
dataNCBI.query.i. test2.ids
97 CBS 119048 AND Botryosphaeriaceae[ORGN] 14279559
98 CBS 119048 AND Botryosphaeriaceae[ORGN] 14279558
99 CBS 113190 AND Botryosphaeriaceae[ORGN] No items found.
100 CBS 116741 AND Botryosphaeriaceae[ORGN] 51094092
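To make the goal concrete, here is a minimal sketch of the behaviour I'm after. It assumes the loop stops because data.frame() cannot combine a single query with an empty test2$ids, which I have not confirmed. The names search_ids, collect_ids, search_fn and placeholder are hypothetical helpers made up for this illustration, not part of rentrez.

```r
# Hypothetical helper: run one search and always return at least one row.
# search_fn is any function that takes a query string and returns a list
# with an `ids` element (as entrez_search does).
search_ids <- function(query, search_fn, placeholder = "No items found.") {
  ids <- tryCatch(
    as.character(unlist(search_fn(query)$ids)),
    # Keep the error message instead of stopping the whole run
    error = function(e) paste("Search failed:", conditionMessage(e))
  )
  # An empty result would make data.frame() fail, so substitute a placeholder
  if (length(ids) == 0) ids <- placeholder
  data.frame(query = query, ids = ids, stringsAsFactors = FALSE)
}

# Hypothetical helper: apply search_ids to every query and stack the results.
collect_ids <- function(queries, search_fn, placeholder = "No items found.") {
  rows <- lapply(as.character(queries), search_ids,
                 search_fn = search_fn, placeholder = placeholder)
  do.call(rbind, rows)
}

# Usage with rentrez (needs network access, so left commented out):
# library(rentrez)
# ncbi_search <- function(q) entrez_search(db = "nucleotide", term = q, retmax = 40)
# big_data2 <- collect_ids(dataNCBI$query, ncbi_search)
```

Passing the search function as an argument is only there so the logic can be tried offline with a stand-in function. Queries that fail with an error are labelled separately from queries that simply return nothing.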
Any tips on how to solve this problem?
Best regards,
Eduardo