A package for dimensionality reduction of large data

Authors: Sean Hughes, Angela Li, Ju Kim, Malisa Smith, Ted Laderas

This post describes a project from rOpenSci unconf18. In the spirit of exploration and experimentation at our unconferences, projects are not necessarily finished products or in scope for rOpenSci packages.

A few weeks ago, as part of the rOpenSci Unconference, a group of us decided to work on making the UMAP algorithm accessible within R. UMAP (Uniform Manifold Approximation and Projection) is a dimensionality reduction technique that reduces high-dimensional data (many columns) to a smaller number of columns, typically for visualization. It is similar in purpose to both Principal Components Analysis (PCA) and t-SNE, techniques often used to visualize high-dimensional data in single-cell omics fields such as genomics, flow cytometry, and proteomics.
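To make "reducing many columns to a few" concrete, here is a minimal sketch in plain Python. It is an illustration only: it uses PCA (via power iteration), not UMAP, and it is not the umapr API. The function names (`center`, `covariance`, `leading_direction`, `reduce_columns`) and the toy data are assumptions made up for this example.

```python
import random


def center(rows):
    """Subtract each column's mean so every column averages to zero."""
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in rows]


def covariance(centered):
    """Sample covariance matrix (d x d) of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(d)] for i in range(d)]


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(matrix_row, vector))
            for matrix_row in matrix]


def leading_direction(cov, iterations=500):
    """Approximate the dominant eigenvector of cov by power iteration."""
    d = len(cov)
    v = [1.0 / (i + 1) for i in range(d)]  # arbitrary non-zero start
    start_norm = sum(x * x for x in v) ** 0.5
    v = [x / start_norm for x in v]
    for _ in range(iterations):
        w = mat_vec(cov, v)
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # no variance left to explain
            break
        v = [x / norm for x in w]
    return v


def reduce_columns(rows, n_components=2):
    """Project rows onto their first n_components principal directions."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    directions = []
    for _ in range(n_components):
        v = leading_direction(cov)
        eigenvalue = sum(a * b for a, b in zip(v, mat_vec(cov, v)))
        # Deflation: remove the direction just found so the next
        # pass converges to a different one.
        cov = [[cov[i][j] - eigenvalue * v[i] * v[j] for j in range(d)]
               for i in range(d)]
        directions.append(v)
    return [[sum(x * c for x, c in zip(row, v)) for v in directions]
            for row in centered]


# Toy data: 50 rows, 5 columns with very different spreads (made up).
rng = random.Random(0)
scales = [10.0, 5.0, 1.0, 0.5, 0.1]
data = [[rng.gauss(0.0, s) for s in scales] for _ in range(50)]

embedding = reduce_columns(data)
print(len(embedding), len(embedding[0]))  # 50 2
```

Each of the 50 rows now has two coordinates that could be drawn on a scatter plot. PCA picks those coordinates with a linear projection; UMAP and t-SNE produce the same shape of output using nonlinear methods.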

Read the full post about the umapr package, including profiling runtime and memory use on different datasets as well as a Shiny app: https://ropensci.org/blog/2018/08/01/umapr/