Taxize: Get rank of lowest common taxon


#21

Thanks much! Sorry, I haven’t gotten to the pull request yet; will do so very soon.

Thanks for adding the rank name test.

Will have a look at that warning vs. stop thing.


#22

No rush from my end. Holler if I can help.

I didn’t submit a pull request for the rank name validation; it’s pushed to my fork, but I didn’t know if piling multiple pull requests on top of one another was a good idea.


#23

It makes sense to add commits to the PR you sent if they are related. If they are not related, definitely send them as a different PR.


#24

#1

That makes sense: allow calling classification() outside of the function first, then pass the results in to lowest_common(). I’ll think about how to make that as smooth as possible.

#2

Where is that code in your PR?

#3

Not off the top, but I’ll think about it.

We can make this work; we just need to tweak the internals a bit.

Thanks for pointing that out, will have a look.
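
To make point #1 concrete, here is a minimal sketch of finding the lowest common taxon from classifications that were computed beforehand. It is not the taxize implementation: the helper name lowest_common_from_class() is hypothetical, and the toy data frames are typed by hand with made-up ids to stand in for the output of classification().

```r
# Hypothetical helper, not part of taxize: find the lowest taxon shared by
# all classifications. Each classification is assumed to be a data.frame
# ordered from the highest rank to the lowest, with columns name, rank, id.
lowest_common_from_class <- function(class_list) {
  # ids that occur in every classification
  shared_ids <- Reduce(intersect, lapply(class_list, function(z) z$id))
  first <- class_list[[1]]
  shared <- first[first$id %in% shared_ids, , drop = FALSE]
  if (nrow(shared) == 0) {
    return(data.frame(name = NA_character_, rank = NA_character_,
                      id = NA_character_, stringsAsFactors = FALSE))
  }
  # the last shared row is the lowest shared taxon
  out <- shared[nrow(shared), c("name", "rank", "id"), drop = FALSE]
  rownames(out) <- NULL
  out
}

# Toy classifications (ids are made up for illustration)
hops <- data.frame(
  name = c("Eukaryota", "Viridiplantae", "Cannabaceae", "Humulus lupulus"),
  rank = c("superkingdom", "kingdom", "family", "species"),
  id = c("1", "2", "3", "4"), stringsAsFactors = FALSE)
human <- data.frame(
  name = c("Eukaryota", "Metazoa", "Hominidae", "Homo sapiens"),
  rank = c("superkingdom", "kingdom", "family", "species"),
  id = c("1", "5", "6", "7"), stringsAsFactors = FALSE)

res <- lowest_common_from_class(list(hops, human))
```

Because the lookup step is separated from the comparison step, the classifications only need to be fetched once and can be reused across several calls.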


#25

Regarding adding a check commit to the pull request: Looks like it’s already there? https://github.com/ropensci/taxize/pull/509/commits

#2: line 39 (low_rank = NULL); line 50:55.


#26

@jimmyodonnell Seems like the results that get spit out should include the taxon name itself as well as the rank of that name (if the name is not known, then the replacement you’ve put in). Agree?


#27

I’m confused – Do you mean when a low_rank option is forced? I suppose I didn’t think that was necessary since it was supplied, but maybe in some cases it’d be useful?


#28

As far as possible, it’s best if functions always return the same structure, whether it’s a single character string, a vector, a data.frame, etc. With the low_rank option it returns a single character vector, while not using it returns a data.frame, e.g.:

             name      rank     id
16 Epidendroideae subfamily 158332
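
One possible way to keep the return type stable is to route every result through a small constructor that always produces the same columns. The sketch below assumes a hypothetical helper, as_taxon_row(); it is not taxize code, only an illustration of the principle.

```r
# Hypothetical helper (not part of taxize): always return a one-row
# data.frame with the columns name, rank and id, filling gaps with NA.
as_taxon_row <- function(name = NA_character_, rank = NA_character_,
                         id = NA_character_) {
  data.frame(name = name, rank = rank, id = id, stringsAsFactors = FALSE)
}

# A fully known taxon, and a result where only the rank is known
full <- as_taxon_row("Epidendroideae", "subfamily", "158332")
rank_only <- as_taxon_row(rank = "family")

# Both results have the same columns, so callers can rbind() them
both <- rbind(full, rank_only)
```

With this shape, downstream code never has to check whether it received a character vector or a data.frame.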

#29

Ah; roger that. I see why you were looking to put them in separate functions originally. Which makes more sense to you: Two separate functions, or one whose output looks like this if the lowest common taxon is higher than the specified level:

> lowest_common(get_uid(c("Humulus lupulus", "Homo sapiens")), low_rank = "family")
  name   rank  id
1   NA family  NA

#30

Is that supposed to be a taxonomic name in that data.frame that’s returned?


#31

No. The intention there is to say “Show me the name of the taxon at the family level that these taxa have in common”. Because Hops and Humans do not share the same taxon at the family level, it outputs NA.

This might seem totally pointless, but for metabarcoding studies it’s pretty common to want to consolidate everything at the same taxonomic rank.
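
The behaviour described here can be sketched roughly as follows. This is an illustration only, not the code in the pull request: common_at_rank() is a hypothetical name, and the classifications are hand-typed toy data with made-up ids.

```r
# Hypothetical sketch (not taxize's implementation): return the taxon that
# all classifications share at `low_rank`, or NA if they differ there.
common_at_rank <- function(class_list, low_rank) {
  # pull the id found at the requested rank from each classification
  ids <- vapply(class_list, function(z) {
    hit <- z$id[z$rank == low_rank]
    if (length(hit) == 1) hit else NA_character_
  }, character(1))
  # default result: the rank is known, name and id are NA
  out <- data.frame(name = NA_character_, rank = low_rank,
                    id = NA_character_, stringsAsFactors = FALSE)
  if (!anyNA(ids) && length(unique(ids)) == 1) {
    first <- class_list[[1]]
    out$name <- first$name[first$rank == low_rank]
    out$id <- ids[[1]]
  }
  out
}

# Toy classifications (ids are made up for illustration)
hops <- data.frame(name = c("Cannabaceae", "Humulus lupulus"),
                   rank = c("family", "species"),
                   id = c("3", "4"), stringsAsFactors = FALSE)
hemp <- data.frame(name = c("Cannabaceae", "Cannabis sativa"),
                   rank = c("family", "species"),
                   id = c("3", "8"), stringsAsFactors = FALSE)
human <- data.frame(name = c("Hominidae", "Homo sapiens"),
                    rank = c("family", "species"),
                    id = c("6", "7"), stringsAsFactors = FALSE)

different <- common_at_rank(list(hops, human), low_rank = "family")
same <- common_at_rank(list(hops, hemp), low_rank = "family")
```

In this sketch, hops and humans give a row with NA for name and id at the family rank, while hops and hemp give the shared family, and both results have the same columns.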

Does that make any sense?


#32

Yep, that makes sense

Also, I made some changes; you can see the diff here: https://github.com/ropensci/taxize/commit/f145fb28eec4ffb6b06ef2de9e97209fe67077a8?w=1


#33

Looks great. As is, the valid_names object is not defined; is it added elsewhere?


#34

I’m not understanding; what do you mean?


#35

Sorry, I Friday afternooned – looked at just the diff, not the whole file. Everything looks fine.


#36

jimmyodonnell and sckott, hello. You guys seem to be experts, or at least you seem to know these things much better than me. I registered just a while ago because I want to learn instead of sitting at home doing nothing but searching for Humatrope (that’s what I am taking right now because of health issues). I see that you really know these things, and I wanted to ask if you could answer the questions that I will most likely have later. Thanks.