Field data collection workflow


For disciplines that do fieldwork in remote locations to collect research data, R could help with the data collection workflow. Open Data Kit (ODK) is a popular tool for mobile data collection on Android devices. Its data collection app, ODK Collect, supports a wide variety of inputs (text, numbers, location, barcodes, multimedia such as images and video, etc.) and works well offline.

Currently, the user has a variety of options for designing their data collection form, such as a web tool, a Python library, or an app that converts an Excel sheet into an ODK form. The form itself is an XML file.

An R package for designing ODK Collect forms would let users create their forms with R code, making form-building reproducible, version-controlled and well documented. There is also the obvious advantage (for the monoglot R user, at least) of having R at both the start and the end of the data collection and analysis workflow.

A first pass at this might be porting the pyxform Python package (which makes XForms) to R.
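
To make the "form as code" idea concrete, here is a minimal sketch of generating a bare-bones XForm from a list of questions. It uses only Python's standard library rather than pyxform itself, and the `build_xform` function and its `(name, type, label)` question format are hypothetical names invented for this illustration. The output follows the general XForms layout (a model with an instance and binds, plus input controls in the body) but has not been validated against ODK Collect, which expects more (for example, form metadata).

```python
import xml.etree.ElementTree as ET

XF = "http://www.w3.org/2002/xforms"   # XForms namespace (used as the default)
XH = "http://www.w3.org/1999/xhtml"    # XHTML namespace (prefix "h")
ET.register_namespace("", XF)
ET.register_namespace("h", XH)


def build_xform(form_id, title, questions):
    """Build a minimal XForm string.

    `questions` is a list of (name, type, label) tuples; this spec
    format is an assumption made for the sketch, not an ODK convention.
    """
    html = ET.Element(f"{{{XH}}}html")
    head = ET.SubElement(html, f"{{{XH}}}head")
    ET.SubElement(head, f"{{{XH}}}title").text = title
    model = ET.SubElement(head, f"{{{XF}}}model")
    instance = ET.SubElement(model, f"{{{XF}}}instance")
    data = ET.SubElement(instance, f"{{{XF}}}data", {"id": form_id})
    body = ET.SubElement(html, f"{{{XH}}}body")

    for name, qtype, label in questions:
        # Placeholder node that will hold the answer in a submission
        ET.SubElement(data, f"{{{XF}}}{name}")
        # Bind declares the data type of that node
        ET.SubElement(model, f"{{{XF}}}bind",
                      {"nodeset": f"/data/{name}", "type": qtype})
        # Input control shown to the data collector
        control = ET.SubElement(body, f"{{{XF}}}input",
                                {"ref": f"/data/{name}"})
        ET.SubElement(control, f"{{{XF}}}label").text = label

    return ET.tostring(html, encoding="unicode")


if __name__ == "__main__":
    print(build_xform(
        "tree_survey",
        "Tree survey",
        [("species", "string", "Species"),
         ("dbh_cm", "decimal", "Diameter at breast height (cm)")],
    ))
```

Because the question list is plain data in a script, it can be kept under version control and diffed like any other code, which is the benefit an R equivalent would aim for.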


That sounds awesome! Can I suggest putting it on the wishlist?


Looks useful. I second @Ironholds.


Thanks @Ironholds and @thosjleeper, I’ve added it to the wishlist and created a wiki page for it.


I just came across this related effort: the koboloadeR package, and the blog post

It’s an R package for interacting with the KoBo Toolbox data collection system by connecting to the KoBo API. KoBo Toolbox is built on top of the Open Data Kit platform (and is perhaps a little more user-friendly for mobile data collection, online or offline).


Thanks Ben. Do you think this does everything you’d need?


Good question, I haven’t had a chance to test it yet. I’ll keep you posted!


Please check it out, and kindly help me configure it with R to analyze survey data. #koboloadeR package


What’s your question, more specifically, @KizitoKojwang?


We are currently using ODK to collect data from a number of projects in multiple countries. We are now starting to build Shiny dashboards that access the data from Google Sheets, effectively giving the data collectors real-time access (once they have a network connection) to view their data. What would be great is a package that could not only build ODK questionnaires but also deal with the raw data stored in Google Sheets, i.e. using dplyr and data.table to transform the responses from numeric codes back to human-readable responses.

As I’ll be doing this as part of my job, I’d be happy to collaborate on any development of an R package for dealing with ODK data and its dissemination.
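
The recoding step described above (numeric codes back to human-readable responses) is essentially a lookup against the form's choice lists. Below is a minimal sketch of that idea in plain Python; in R the same thing would be a join with dplyr or data.table. The choices table is shaped like the `choices` sheet of an XLSForm (`list_name`, `name`, `label`), but the data, the `QUESTION_LISTS` mapping and the function names are all hypothetical.

```python
# Hypothetical choice lists, shaped like an XLSForm "choices" sheet
CHOICES = [
    {"list_name": "yes_no", "name": "1", "label": "Yes"},
    {"list_name": "yes_no", "name": "0", "label": "No"},
    {"list_name": "water_source", "name": "1", "label": "Borehole"},
    {"list_name": "water_source", "name": "2", "label": "River"},
]

# Hypothetical mapping from question name to the choice list it uses
QUESTION_LISTS = {"has_latrine": "yes_no", "main_source": "water_source"}


def build_lookup(choices):
    """Turn the choices table into {list_name: {code: label}}."""
    lookup = {}
    for row in choices:
        lookup.setdefault(row["list_name"], {})[str(row["name"])] = row["label"]
    return lookup


def recode(records, question_lists, lookup):
    """Return copies of `records` with coded answers replaced by labels.

    Codes that are not in the choice list are left unchanged so that
    unexpected values stay visible instead of silently disappearing.
    """
    recoded = []
    for record in records:
        new_record = dict(record)  # do not modify the raw data
        for question, list_name in question_lists.items():
            if question in new_record:
                code = str(new_record[question])
                new_record[question] = lookup[list_name].get(
                    code, new_record[question])
        recoded.append(new_record)
    return recoded


if __name__ == "__main__":
    raw = [{"id": "a", "has_latrine": 1, "main_source": "2"}]
    print(recode(raw, QUESTION_LISTS, build_lookup(CHOICES)))
```

Keeping the raw records untouched and recoding into copies means the dashboard can always fall back to the original codes if a choice list changes.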