Relevance inquiry: a wrapper for HESA Open Data and other Higher Education statistics

I’ve used R to analyse HESA Open Data (from the UK’s Higher Education Statistics Agency), alongside data from several other UK public bodies. Without an official API or openly published R packages, it takes some wrangling. No data is personally identifiable, focusing on providers not students. It gets very interesting when combining datasets such as income models, cost lines, subject clusters, employment outcomes, diversity impact, and Research Excellence Framework results.

Would a wrapper that handles the wrangling and provides tidyverse-friendly access to the datasets be a relevant submission to rOpenSci? (Without any analysis in the package itself.) I think it would be an attractive dataset for students and academics dipping their toes into R, and it could encourage statisticians to engage with universities’ challenges. Only the lack of an API holds us back. I have no experience with package development, so please let me know if this sparks interest!

Thanks,

Ian


Hi @shooting.fish! Thanks for your inquiry and interest in participating in rOpenSci software review!

If you have started developing the package, would you mind opening a pre-submission inquiry on the software review repository? (see also our author guide).

If you have not started developing the package: writing such a package might be useful to you in the first place, as you are a user of the data. Packaging up your code can be very handy, and our development guide lists resources to get started.

Note our development guide also has a policy section on Ethics, Data Privacy and Human Subjects Research (I see you wrote “No data is personally identifiable, focusing on providers not students”).

As a side note, in terms of topics, your package idea reminds me of the now archived ropensci-archive/refimpact on GitHub, an API wrapper for the UK REF 2014 Impact Case Studies Database.

Also, if you happen to be located… not in the UK (or are ok with the time anyhow), the following co-working event might be relevant to you:

Tuesday, 04 October 2022 9 AM Australian Western / 1:00 UTC “Start writing that package!” Hosted by community host @njtierney and @steffilazerte
* Cowork on a project of your choice;
* Take time to look up how to write a package;
* Start putting together that package you’ve always meant to;
* Or talk to Nick Tierney and others about how to get started.

Thank you for those! I have written quite a bit of code, which I now need to slim down to the reusable parts. As it will be my first experience of sharing a package, the getting started links are most useful.

REF Impact studies are closely related structures, yes. As we now have results for REF 2021 (it has a very long assessment cycle) I’ll consider an updated wrapper as a way of learning good practice.

The event looks relevant, and if I’m awake then (or there’s a relevant session at the alternate time another month) I will join on Zoom.
