[skimr] why can't I remove skimmers from the skimmer list and add new skimmers in the same instruction?


#1

Hi,

I hope this is the right place for this kind of question. Please, let me know if I’m wrong.

I’m using skimr, and I would like to change the default list of summary functions (“skimmers”) for class numeric. Specifically, I’m trying to remove the hist skimmer and all quantile-related skimmers (except median), and to add an iqr skimmer. My goal is to keep only the usual statistics for (approximately) normally distributed RVs (mean & sd) and non-normally distributed RVs (median & iqr), together with the columns that help you spot NAs (missing, complete, n).

library(skimr)

# test data frame
foo <- data.frame(x = c(1, 2, 3), y = c(1, 2, NA))

# if I try to remove skimmers and add a new skimmer in the same call, the output is wrong:
# the removed columns are still there, and they all show the iqr value
skim_with(numeric = list(p0 = NULL, p25 = NULL, p75 = NULL, p100 = NULL,
                         hist = NULL, iqr = function(x) IQR(x, na.rm = TRUE)))
skim(foo)
# Skim summary statistics
# n obs: 3 
# n variables: 2 
#
# Variable type: numeric 
#  variable missing complete n mean   sd  p0 p25 median p75 p100 hist iqr
#        x       0        3 3  2   1    1   1      2   1    1    1   1  
#        y       1        2 3  1.5 0.71 0.5 0.5    1.5 0.5  0.5  0.5 0.5

# instead, if I first remove the skimmers and then add the new skimmer in a separate call, it works
skim_with_defaults()
skim_with(numeric = list(p0 = NULL, p25 = NULL, p75 = NULL, p100 = NULL, hist = NULL))
skim_with(numeric = list(iqr = function(x) IQR(x, na.rm = TRUE)))
skim(foo)
# Skim summary statistics
# n obs: 3 
# n variables: 2 
# 
# Variable type: numeric 
# variable missing complete n mean   sd median iqr
# x       0        3 3  2   1       2   1  
# y       1        2 3  1.5 0.71    1.5 0.5
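
As a sanity check, the values I expect can be reproduced with base R alone, without skimr. This is only a minimal sketch: `target_stats` is a hypothetical helper of my own, not a skimr function, and it simply applies the four statistics I want to each column of `foo`.

```r
# Same test data frame as above
foo <- data.frame(x = c(1, 2, 3), y = c(1, 2, NA))

# Hypothetical helper (not part of skimr): the four statistics I want,
# each computed with NAs removed
target_stats <- function(x) {
  c(mean   = mean(x, na.rm = TRUE),
    sd     = sd(x, na.rm = TRUE),
    median = median(x, na.rm = TRUE),
    iqr    = IQR(x, na.rm = TRUE))
}

# One row per variable, one column per statistic
expected <- t(sapply(foo, target_stats))
print(round(expected, 2))
```

The rounded values agree with the mean, sd, median and iqr columns of the second skim output, so the two-call version is the one producing the right result.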

Why is this? It’s a bit annoying to need multiple calls, especially when I want to modify skimmers for several classes at once, though of course it’s not a major issue.


#2

Hi,

You should add append = FALSE to your skim_with() call. That replaces the existing skimmers for any types you provide skimmers for.


#3

@elinw thanks for the reply and apologies for the delay. I tried your suggestion but it doesn’t seem to work (or I misunderstood you):

foo <- data.frame(x = c(1, 2, 3), y = c(1, 2, NA))
skim_with(numeric = list(p0 = NULL, p25 = NULL, p75 = NULL, p100 = NULL, hist = NULL,
                         iqr = function(x) IQR(x, na.rm = TRUE)), append = FALSE)
skim(foo)
# Error in .x(x) : could not find function ".x"

Do you know what I’m doing wrong?


#4

Like this:

df <- data.frame(x = c(1, 2, 3), y = c(1, 2, NA))
skim_with(numeric = list(iqr = function(x) IQR(x, na.rm = TRUE)), append = FALSE)
skim(df)

You probably should be able to do it your way, though. Would you file an issue in the tracker?


#5

Sorry, Andrea! That’s a bug. There’s a PR into the development version of skimr to fix this.


#6

@michaelquinn32 no problem! I will use two separate commands for now, and look forward to a new version of your great package :)