Author: David Winter
I am happy to say that the latest issue of The R Journal includes a paper describing rentrez, the rOpenSci package for retrieving data from the National Center for Biotechnology Information (NCBI).
The NCBI is one of the most important sources of biological data. The centre provides access to information on 28 million scholarly articles through PubMed and 250 million DNA sequences through GenBank. More importantly, records in the 50 public databases maintained by the NCBI are strongly cross-referenced. As a result, it is possible to pinpoint searches using almost 2 million taxonomic names or a controlled vocabulary with 270,000 terms. rentrez has been designed to make it easy to search for NCBI records and download them from within an R session.
Read the rest here: