rOpenSci package or resource used
epinowcast, targets, piggyback
What did you do?
COVID-19 hospitalisations in Germany are released by date of positive test rather than by date of admission. This has advantages for surveillance, as these data are closer to the date of infection and so easier to link to underlying transmission dynamics and public health interventions. Unfortunately, when released in this way the latest data are right-censored, meaning that final hospitalisations for a given day are initially underreported. This issue is common in data sets used for the surveillance of infectious diseases and can lead to delayed or biased decision making. Fortunately, when data from a series of days are available, we can estimate the degree of censoring and produce estimates of final hospitalisations, adjusted for truncation, with appropriate uncertainty. This is usually known as a nowcast.
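As a toy illustration of the idea (this is not the epinowcast model, and all numbers are made up): if we know from fully reported days what proportion of final hospitalisations is typically reported by each delay, we can scale up the partial counts for recent days.

```r
# Toy right-truncation correction using empirical reporting proportions.
# Counts and delays below are invented for illustration only.

# Final hospitalisations for two fully reported days, by reporting delay (days)
reported_by_delay <- matrix(
  c(40, 30, 20, 10,   # day 1: reports arriving at delays 0, 1, 2, 3
    44, 33, 22, 11),  # day 2
  nrow = 2, byrow = TRUE
)

# Empirical cumulative proportion reported by each delay, averaged over days
cum_reported <- t(apply(reported_by_delay, 1, cumsum))
totals <- rowSums(reported_by_delay)
prop_reported <- colMeans(cum_reported / totals)  # 0.4, 0.7, 0.9, 1.0 here

# A recent day observed only up to a delay of 1 day so far
observed_so_far <- 70
nowcast <- observed_so_far / prop_reported[2]  # scale up to the expected total
nowcast
```

A real nowcasting model additionally quantifies the uncertainty in this adjustment, which grows for the most recent (least reported) days.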
In this work, we aim to evaluate a series of novel semi-parametric nowcasting model formulations in real time and provide an example workflow to allow others to do the same, using German COVID-19 hospitalisations by date of positive test at the national level, both overall and by age group, and at the state level. This project is part of a wider collaboration assessing a range of nowcasting methods whilst providing an ensemble nowcast of COVID-19 hospital admissions in Germany by date of positive test.
All models are implemented using the epinowcast R package. The nowcasting and evaluation pipeline is implemented using the targets R package. All input data, interim data, and output data are available and should also be fully reproducible from the provided code. Please see the resources section for details. Further details on our methodology are included in our paper.
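The pipeline follows the usual targets pattern of declaring a list of interdependent targets in a `_targets.R` file. A minimal sketch of that pattern is below; the file path and the `fit_nowcast()` and `score_nowcast()` functions are hypothetical stand-ins, not the project's actual code.

```r
# _targets.R (sketch): declare the pipeline as a list of targets.
# targets tracks dependencies and only reruns what changed.
library(targets)

list(
  # Track the raw data file so downstream targets rerun when it changes
  tar_target(raw_data_file, "data/hospitalisations.csv", format = "file"),
  tar_target(hospitalisations, read.csv(raw_data_file)),
  # Hypothetical wrappers around model fitting (e.g. via epinowcast) and scoring
  tar_target(nowcast, fit_nowcast(hospitalisations)),
  tar_target(scores, score_nowcast(nowcast, hospitalisations))
)
```

Running `targets::tar_make()` then builds the pipeline end to end, which is what makes the published results reproducible from the code and input data alone.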
URL or code snippet for your use case
https://epiforecasts.io/eval-germany-sp-nowcasting/
Image
Sector
academic
Field(s) of application
epidemiology
Comments
In general, the targets ecosystem is well developed and easy to use. For my use case, the biggest missing features are currently integration with cloud compute services and transient containerised workflows. In many ways, these are not issues with targets itself but with the wider ecosystem of R packages supporting modern distributed workflows. I am still exploring targets workflows (see here for another example not using R Markdown), so any feedback, hints, or tips are very much appreciated.
piggyback is a really nice and simple way to share data via GitHub releases. That said, support for automatic file tracking, rather than manually uploading results, would likely greatly improve the workflow. It is also not entirely clear to me whether piggyback is the currently recommended choice for sharing scientific data (there are quite a few options, with osfr also being relatively straightforward to use, if not seamless). Clarification of current best practices for data workflows, and of the tools that support them, would be very useful for improving my practice.
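For context, the manual upload/download pattern I mean looks roughly like the sketch below. The repository name, release tag, and file paths are hypothetical; `pb_upload()` and `pb_download()` are piggyback functions and require a GitHub token (e.g. `GITHUB_PAT`) with access to the repository.

```r
# Sketch: sharing pipeline outputs via GitHub releases with piggyback.
library(piggyback)

# Upload a result file to an existing release of a (hypothetical) repository
pb_upload("output/nowcast.csv",
          repo = "user/eval-nowcasting",
          tag  = "latest")

# Later, or on another machine, fetch the shared file back down
pb_download("nowcast.csv",
            repo = "user/eval-nowcasting",
            tag  = "latest",
            dest = "output")
```

Because each upload and download is an explicit call, keeping releases in sync with pipeline outputs is a manual step; automatic tracking of which local files have changed would remove that friction.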