nodbi provides a single interface to many NoSQL databases.
So far we support the following databases:
- MongoDB
- Redis (server based)
- CouchDB
- Elasticsearch
- etcd
nodbi is focused on working with data.frames, since they’re a common data format in R, and data.frames make it easy to then go downstream using other tools in your workflow.
Currently we have support for the following operations:
- Create - all databases
- Get - all databases
- Delete - all databases
- Update - just CouchDB
Install/load
GitHub: https://github.com/ropensci/nodbi
CRAN: CRAN - Package nodbi
# only the source package is available right now; binaries should be available soon
install.packages("nodbi", type = "source")
library("nodbi")
Initialize a connection
Before initializing a connection, make sure your database is running if it’s server based (we may support serverless databases in the future, in which case there’s no server to start).
There’s a family of functions that start with src_ that you use to set your connection details for each database we support. You then pass that connection object on to any of the functions docdb_create, docdb_delete, docdb_get, and docdb_update.
src_couchdb()
src_elastic()
src_etcd()
src_mongo()
src_redis()
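The pattern described above (build a connection with a src_ function, then hand it to the docdb_ functions) can be sketched as a small helper. This is only an illustration: round_trip() is a hypothetical name, not part of nodbi, and the sketch assumes that docdb_delete() takes the connection and key in the same order as docdb_get().

```r
# Sketch only: assumes nodbi is installed and the database behind `con` is running.
# round_trip() is a hypothetical helper, not a nodbi function.
round_trip <- function(con, key, df) {
  # Store the data.frame under `key`
  nodbi::docdb_create(con, key, df)
  # Read it back
  out <- nodbi::docdb_get(con, key)
  # Assumption: docdb_delete() accepts (connection, key), like docdb_get()
  nodbi::docdb_delete(con, key)
  out
}

# Usage (not run; needs a running Redis server):
# con <- nodbi::src_redis()
# out <- round_trip(con, "mtcars", mtcars)
# NROW(out)
```

Because the connection object is the only database-specific piece, the same helper should in principle work with a connection from any of the other src_ functions, although how each backend interprets the key may differ.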
Example
We’ll use Redis moving forward. First, initialize a connection (remember to start Redis first, e.g. on the command line with redis-server):
(con <- src_redis())
#> $type
#> [1] "redis"
#>
#> $version
#> [1] ‘1.1.0’
#>
#> $con
#> <redis_api>
#> Redis commands:
#> ... cutoff
The con object contains connection details.
Now, let’s push a data.frame into Redis from R:
library("ggplot2")
ff <- docdb_create(con, "diamonds", diamonds)
out <- docdb_get(con, "diamonds")
NROW(out)
#> [1] 161820
head(out)
#> # A tibble: 6 x 10
#> carat cut color clarity depth table price x y z
#> <dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>
#> 1 0.23 Ideal E SI2 61.5 55 326 3.95 3.98 2.43
#> 2 0.21 Premium E SI1 59.8 61 326 3.89 3.84 2.31
#> 3 0.23 Good E VS1 56.9 65 327 4.05 4.07 2.31
#> 4 0.290 Premium I VS2 62.4 58 334 4.2 4.23 2.63
#> 5 0.31 Good J SI2 63.3 58 335 4.34 4.35 2.75
#> 6 0.24 Very Good J VVS2 62.8 57 336 3.94 3.96 2.48
And they’re identical! (Even down to maintaining the factor classes in the diamonds data.frame; note that not all database connectors support maintaining column classes as well as the Redis one does.)
identical(diamonds, out)
#> [1] TRUE
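When identical() returns FALSE with another backend, it can help to see which column classes changed. Below is a minimal base R sketch; class_changes() is a hypothetical helper (not part of nodbi), and the "retrieved" data here is simulated by converting a factor to character rather than fetched from a real database.

```r
# Hypothetical helper (not part of nodbi): list columns whose class differs
# between the original data.frame and the one read back from the database.
class_changes <- function(original, retrieved) {
  # Collapse multi-class columns such as c("ordered", "factor") into one string
  cls <- function(df) {
    vapply(df, function(col) paste(class(col), collapse = "/"), character(1))
  }
  before <- cls(original)
  after <- cls(retrieved)[names(before)]
  # A column missing from `retrieved` shows up as NA and counts as changed
  changed <- is.na(after) | before != after
  data.frame(
    column = names(before)[changed],
    before = unname(before[changed]),
    after = unname(after[changed]),
    stringsAsFactors = FALSE
  )
}

# Simulated round trip: pretend the backend returned the factor as character
original <- data.frame(cut = factor(c("Ideal", "Good")), price = c(326L, 327L))
retrieved <- original
retrieved$cut <- as.character(retrieved$cut)

class_changes(original, retrieved)
```

In this simulated case the result has a single row, reporting that cut went from factor to character; an empty result means all column classes survived the round trip.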
You can easily fold a NoSQL database into your data munging workflow. Let’s say you’re using dplyr, and you want to get data out of Redis and then munge it with dplyr.
One line of code gets the data out of Redis, and then you can do whatever your munging heart desires.
library("dplyr")
docdb_get(con, "diamonds") %>%
  group_by(cut) %>%
  summarise(mean_depth = mean(depth), mean_price = mean(price))
#> # A tibble: 5 x 3
#> cut mean_depth mean_price
#> <ord> <dbl> <dbl>
#> 1 Fair 64.0 4359.
#> 2 Good 62.4 3929.
#> 3 Very Good 61.8 3982.
#> 4 Premium 61.3 4584.
#> 5 Ideal 61.7 3458.
Let us know what you think. Features? Bugs? Additional databases we should support?