CRAN checks data: quick look across all check results

We maintain an API (https://cranchecks.info/) that gives you access to CRAN checks data. The code for the API lives on GitHub at sckott/cchecksapi (CRAN checks API; the repository is marked DEFUNCT).

We have an accompanying R package cchecks that works with the API.

The API also serves badges; see, e.g., the home page of the data.table wiki on GitHub (Rdatatable/data.table). The exact look of the badge depends on what the checks say.

Here’s an example of using the R package:

setup

remotes::install_github("ropenscilabs/cchecks")
library(cchecks)
library(dplyr)

get data

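## page through all packages, 1000 at a time
x <- lapply(seq(0, 13000, by = 1000), function(off) {
  cch_pkgs(limit = 1000, offset = off)
})
## prepare data for below: one row per package with its check title and output
## (as_tibble() replaces the deprecated tbl_df())
df <- as_tibble(bind_rows(lapply(x, function(z) {
  cbind(
    select(z$data, package),
    select(z$data$check_details, output, check)
  )
})))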
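The upper offset of 13000 above is hard-coded to roughly the number of packages on CRAN. A minimal sketch of the same offset-based paging that instead stops when a page comes back short; `fetch_page()` and `fetch_all()` are hypothetical helpers, and the data is a toy data frame rather than the real API:

```r
# Toy stand-in for the API: 25 fake packages, served in pages.
# fetch_page() is hypothetical; with the real package it would wrap
# something like cch_pkgs(limit = limit, offset = offset).
fake_pkgs <- data.frame(package = paste0("pkg", 1:25), stringsAsFactors = FALSE)

fetch_page <- function(limit, offset) {
  idx <- seq_len(nrow(fake_pkgs))
  idx <- idx[idx > offset & idx <= offset + limit]
  fake_pkgs[idx, , drop = FALSE]
}

fetch_all <- function(limit = 10) {
  pages <- list()
  offset <- 0
  repeat {
    page <- fetch_page(limit = limit, offset = offset)
    pages[[length(pages) + 1]] <- page
    # a short (or empty) page means there is nothing left to request
    if (nrow(page) < limit) break
    offset <- offset + limit
  }
  do.call(rbind, pages)
}

all_pkgs <- fetch_all(limit = 10)
nrow(all_pkgs)
```

The trade-off is one possible extra request when the total is an exact multiple of the page size, in exchange for not having to know the total up front.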

quick overview of check results

The summary data slot tallies, for each package, how many checks (across
R versions and operating systems) ended in each outcome. We can summarize
those across packages:

stats <- as_tibble(bind_rows(lapply(x, function(w) w$data$summary)))
stats
#> # A tibble: 13,527 x 6
#>    any      ok  note  warn error  fail
#>    <lgl> <int> <int> <int> <int> <int>
#>  1 FALSE    12     0     0     0     0
#>  2 FALSE    12     0     0     0     0
#>  3 TRUE      0    12     0     0     0
#>  4 FALSE    12     0     0     0     0
#>  5 FALSE    12     0     0     0     0
#>  6 FALSE    12     0     0     0     0
#>  7 TRUE      0    10     2     0     0
#>  8 FALSE    12     0     0     0     0
#>  9 FALSE    12     0     0     0     0
#> 10 FALSE    12     0     0     0     0
#> # … with 13,517 more rows

number of packages that have any problems

stats %>% tally(any)
#> # A tibble: 1 x 1
#>       n
#>   <int>
#> 1  5901

number of packages that have notes, warnings, errors or failures

stats %>% tally(note > 0)
#> # A tibble: 1 x 1
#>       n
#>   <int>
#> 1  4985
stats %>% tally(warn > 0)
#> # A tibble: 1 x 1
#>       n
#>   <int>
#> 1   826
stats %>% tally(error > 0)
#> # A tibble: 1 x 1
#>       n
#>   <int>
#> 1   820
stats %>% tally(fail > 0)
#> # A tibble: 1 x 1
#>       n
#>   <int>
#> 1    30
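The four tallies above can also be computed in one call. A minimal sketch, assuming dplyr >= 1.0.0 (for `across()`) and using a made-up `toy_stats` tibble with the same columns as `stats`:

```r
library(dplyr)

# Toy stand-in for `stats`; the values are made up for illustration
toy_stats <- tibble(
  any   = c(FALSE, TRUE, TRUE, FALSE),
  ok    = c(12L, 0L, 10L, 12L),
  note  = c(0L, 12L, 0L, 0L),
  warn  = c(0L, 0L, 2L, 0L),
  error = c(0L, 0L, 0L, 0L),
  fail  = c(0L, 0L, 0L, 0L)
)

# For each outcome, count the packages with at least one such check
problem_counts <- toy_stats %>%
  summarise(across(c(note, warn, error, fail), ~ sum(.x > 0)))
problem_counts
```

This gives one row with a column per outcome, which is easier to compare at a glance than four separate one-cell tibbles.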

number of packages that don’t have any checks that are ok

stats %>% tally(ok == 0)
#> # A tibble: 1 x 1
#>       n
#>   <int>
#> 1  2309

wowsers, that’s way more than I would have thought - I hope my packages
aren’t in this group, eek.

emails <- c("myrmecocystus_at_gmail.com", "myrmecocystus_r_at_gmail.com", 
    "sckott_at_protonmail.com")
me <- bind_rows(lapply(emails, function(z) {
  cch_maintainers(z)$data$table
}))
me %>% tally(ok == 0)
#>   n
#> 1 0

phew!

check title

This is the title that CRAN gives at the top of the check output

res <- data.frame(arrange(count(df, check), desc(n)))
knitr::kable(res)
| check | n |
|:------|--:|
| NA | 7627 |
| R code for possible problems | 1089 |
| dependencies in R code | 971 |
| DESCRIPTION meta-information | 785 |
| installed package size | 617 |
| compiled code | 516 |
| package dependencies | 476 |
| whether package can be installed | 466 |
| re-building of vignette outputs | 231 |
| data for non-ASCII characters | 161 |
| examples | 133 |
| tests | 84 |
| use of SHLIB_OPENMP_*FLAGS in Makefiles | 75 |
| top-level files | 69 |
| for GNU extensions in Makefiles | 61 |
| S3 generic/method consistency | 55 |
| Rd cross-references | 15 |
| whether the namespace can be loaded with stated dependencies | 14 |
| for hidden files and directories | 13 |
| Rd line widths | 11 |
| foreign function calls | 10 |
| package namespace information | 8 |
| pragmas in C/C++ headers and code | 6 |
| Rd files | 5 |
| running examples for arch ‘i386’ | 5 |
| files in ‘vignettes’ | 4 |
| running R code from vignettes | 4 |
| running tests for arch ‘i386’ | 4 |
| include directives in Makefiles | 3 |
| for executable files | 2 |
| PDF version of manual | 2 |
| for code/documentation mismatches | 1 |
| for portable file names | 1 |
| for unstated dependencies in examples | 1 |
| if this is a source package | 1 |
| Rd \usage sections | 1 |
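The NA row is the count of packages with no check title. A small sketch of the same tally with those rows dropped, using `count(sort = TRUE)` as a shorthand for `arrange(desc(n))`; `toy_df` is made-up data standing in for `df`:

```r
library(dplyr)

# Toy stand-in for `df`; NA in `check` means no check title was reported
toy_df <- tibble(
  package = c("a", "b", "c", "d", "e"),
  check   = c(NA, "tests", "examples", "tests", NA)
)

# Same tally as above, sorted by count in one step
res_sorted <- count(toy_df, check, sort = TRUE)

# Only the packages that actually have a check title
res_problems <- toy_df %>%
  filter(!is.na(check)) %>%
  count(check, sort = TRUE)
res_problems
```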

Many packages (7627) have no check problems. The most common check title is

R code for possible problems

while very uncommon reasons include

PDF version of manual

and

for portable file names

check details (output)

These are the check details you get below the title above

output_sum <- arrange(count(df, output), desc(n))
output_sum$output[1:10]
#>  [1] NA
#>  [2] "Malformed Description field: should contain one or more complete sentences."
#>  [3] "Found the following significant warnings:\n  Warning: S3 methods '[.fun_list', '[.grouped_df', 'all.equal.tbl_df', 'anti_join.data.frame', 'anti_join.tbl_df', 'arrange.data.frame', 'arrange.default', 'arrange.grouped_df', 'arrange.tbl_df', 'arrange_.data.frame', 'arrange_.tbl_df', 'as.data.frame.grouped_df', 'as.data.frame.rowwise_df', 'as.data.frame.tbl_cube', 'as.table.tbl_cube', 'as.tbl.data.frame', 'as.tbl.tbl', 'as.tbl_cube.array', 'as.tbl_cube.data.frame', 'as.tbl_cube.matrix', 'as.tbl_cube.table', 'as_tibble.grouped_df', 'as_tibble.tbl_cube', 'auto_copy.tbl_cube', 'auto_copy.tbl_df', 'cbind.grouped_df', 'collapse.data.frame', 'collect.data.frame', 'common_by.NULL', 'common_by.character', 'common_by.default', 'common_by.list', 'compute.data.frame', 'copy_to.DBIConnection', 'copy_to.src_local', 'default_missing.data.frame', 'default_missing.default', 'dim.tbl_cube', 'distinct.data.frame', 'distinct.default', 'distinct.grouped_df', 'distinct.tbl_df', 'distinct_.data.frame', 'distinct_.grouped_df', 'distinct_.tbl_df', 'do.NULL', 'do.da [... truncated]"
#>  [4] "Malformed Title field: should not end in a period."
#>  [5] "Installation failed."
#>  [6] "Dependence on R version ‘3.3.1’ not with patchlevel 0"
#>  [7] "GNU make is a SystemRequirements."
#>  [8] "Dependence on R version ‘3.3.2’ not with patchlevel 0"
#>  [9] "  src/Makevars: SHLIB_OPENMP_CXXFLAGS is included in PKG_CXXFLAGS but not in PKG_LIBS\n  src/Makevars: SHLIB_OPENMP_CFLAGS is included in PKG_LIBS but linking is by C++\n  src/Makevars.win: SHLIB_OPENMP_CXXFLAGS is included in PKG_CXXFLAGS but not in PKG_LIBS\n  src/Makevars.win: SHLIB_OPENMP_CFLAGS is included in PKG_LIBS but linking is by C++\nUse of these macros is discussed in sect 1.2.1.1 of ‘Writing R\nExtensions’. The macros for different languages may differ so the\nmatching macro must be used in PKG_CXXFLAGS (etc) and match that used\nin PKG_LIBS (except for Fortran: see the manual)."
#> [10] "Dependence on R version ‘3.3.3’ not with patchlevel 0"

No surprise, there are lots of

Malformed Description field: should contain one or more complete sentences

And many

Malformed Title field: should not end in a period.

Classic DESCRIPTION file blues.

I’ve not actually seen this or its variants before:

Dependence on R version ‘3.3.1’ not with patchlevel 0

The award for the longest check string goes to:

m <- arrange(mutate(df, char_count = nchar(output)), desc(char_count))
m$package[1]
#> [1] "doFuture"

at 646051 characters
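A base R sketch of the same lookup on made-up data (`toy_df` is hypothetical). Note that `nchar()` returns NA for missing strings in a character vector, so packages without check output never win:

```r
# Toy stand-in for `df`; package "b" has no check output
toy_df <- data.frame(
  package = c("a", "b", "c"),
  output  = c("short", NA, "a much longer string"),
  stringsAsFactors = FALSE
)

lens <- nchar(toy_df$output)   # NA for the missing output
# which.max() skips NA values
longest <- toy_df$package[which.max(lens)]
longest
```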

fun

:rocket:


How did some of these packages get accepted in the first place then? :thinking:

I think a lot of those might be all notes, so I guess they let those slide, maybe.


Idea for next Friday fun: adding the date of publication on CRAN to see whether more notes were ignored earlier in time?

good idea.

First publication date on CRAN?


Good question! Ideally both the date of first publication and the date of latest CRAN update (since the NOTE might have appeared in the meantime). :slightly_smiling_face: