This page documents vimcheck package design decisions as a guide to users, and to potential contributors seeking to extend this package or revise some of those decisions.
Some design decisions that live in the background include:
vimcheck is intended to collect, house, and be the single source for functionality used by VIMC to check submitted modelling outputs for discrepancies. This solves the immediate problem that this functionality is currently spread and repeated over multiple reports, increasing the potential for discrepancies and functionality drift in the tools. The overall goal is to improve the quality of VIMC’s work by improving the reliability of VIMC outputs. The main users are currently intended to be members of the VIMC Secretariat, but may include VIMC consortium members in future.
vimcheck is currently developed in bursts, with each burst so far adding a set of data wrangling and plotting functions taken from a specific VIMC report.
The package has two axes of organisation for its functionality: the theme or goal of the report from which the function comes, and what the function does.
The current reports from which functions have been taken relate to:
Functions are split into four categories, and the general idea is to have functionality be modular. As an example, vimcheck favours functions that produce intermediate products that can be reused by multiple downstream functions (following the DRY principle).
The general idea is to be able to set up small R pipelines within reports of the following form.
# some data read in from a local source
data |>
  fn_wrangle_data() |>
  fn_prep_data_for_plotting() |>
  fn_plot_data()
The function reference in the documentation is organised similarly.
The R source code files in ./R/ are also organised in
this way; for example R/fn_burden_diagnostics.R holds
data-wrangling functions related to burden estimates,
R/fn_plotting_prep_bur_diag.R holds functions to prepare
wrangled data for plotting, and
R/fn_plotting_burden_diagnostics.R holds plotting functions
for the prepared data.
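The wrangle, prepare, plot pipeline described above can be sketched end to end as follows. This is only an illustration: the three functions, the column names (`scenario`, `year`, `burden`) and the dummy data are hypothetical stand-ins, not functions or data shipped with vimcheck.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-ins for the three stages; NOT actual vimcheck functions.

# Stage 1: wrangle raw tabular input into an intermediate summary.
fn_wrangle_data <- function(data) {
  data |>
    tibble::as_tibble() |>
    group_by(scenario, year) |>
    summarise(burden = sum(burden), .groups = "drop")
}

# Stage 2: prepare the wrangled data for plotting (here, set a factor).
fn_prep_data_for_plotting <- function(data) {
  data |>
    mutate(scenario = factor(scenario))
}

# Stage 3: plot the prepared data.
fn_plot_data <- function(data) {
  ggplot(data, aes(x = year, y = burden, colour = scenario)) +
    geom_line()
}

# Dummy data standing in for data read from a local source.
data <- data.frame(
  scenario = rep(c("novac", "default"), each = 4),
  year = rep(c(2000, 2000, 2001, 2001), times = 2),
  burden = c(10, 20, 30, 40, 5, 10, 15, 20)
)

# The intermediate product can be reused by several downstream functions.
prepped <- data |>
  fn_wrangle_data() |>
  fn_prep_data_for_plotting()

plot_obj <- prepped |>
  fn_plot_data()
```

Keeping `prepped` as a named intermediate object is what lets more than one plotting function consume the same wrangled data.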
vimcheck includes some package data which is used to demonstrate and test its functionality. Some data is purely dummy data that follows the structure of data seen in VIMC reports. However, some data such as eg_impact is real VIMC data that has been released publicly as part of other packages.
There are a number of package constants, which are single values or small vectors that are provided with and exported from the package.
We only list notable dependencies here.
Tidyverse packages are preferred over base R or data.table; this keeps functions within the dependency framework used in the reports from which they come. We assume that the report writers are also the package's user base, and vimcheck aims to be usable by, and friendly to, these people.
cli and glue for string interpolation and printing error messages to screen.
diffdf to provide differences between data.frames.
ggplot2 for plotting; functions are not explicitly namespaced in many cases, but imported from the package to reduce code clutter in plotting function files.
checkmate for input checking and to extend testthat.
vdiffr for snapshot tests of plotting functions.
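As an illustration of the kind of comparison diffdf supports, the sketch below compares two small data.frames keyed on a country code. The data and column names are invented for the example, and this is not necessarily how vimcheck calls diffdf internally.

```r
library(diffdf)

# Invented dummy data: two versions of the same estimates.
old_estimates <- data.frame(
  country = c("AAA", "BBB", "CCC"),
  deaths = c(100, 250, 80)
)
new_estimates <- data.frame(
  country = c("AAA", "BBB", "CCC"),
  deaths = c(100, 260, 80)
)

# Compare row by row, matching rows on `country`.
# suppress_warnings = TRUE silences the warning diffdf raises
# when it finds differences.
differences <- diffdf(
  old_estimates,
  new_estimates,
  keys = "country",
  suppress_warnings = TRUE
)

# TRUE if any difference was found between the two data.frames.
has_diff <- diffdf_has_issues(differences)

# Comparing a data.frame with itself should report no issues.
no_diff <- diffdf_has_issues(
  diffdf(old_estimates, old_estimates, keys = "country")
)
```

Printing `differences` gives a human-readable summary of which variables and rows differ.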
Data-wrangling functions are agnostic to the type of tabular input,
but always return a tibble
rather than a plain data.frame (if they return tabular data). This is
because internal manipulation using Tidyverse functions often results in
tibbles being produced (e.g. using tidyr::pivot_*(), or
dplyr::group_by() followed by ungrouping), but inexplicably
some Tidyverse functions preserve data.frames.
We think it is preferable for users and developers to have a uniform
function output type rather than have to guess whether it will be a
tibble or a data.frame. A second reason is that tibbles are much easier
to read when printed to screen.
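A minimal sketch of this convention is shown below. `fn_add_rate()` and its column names are hypothetical and not part of vimcheck; the point is only that a final `tibble::as_tibble()` makes the return type uniform regardless of the input class.

```r
# Hypothetical wrangling function; NOT part of vimcheck.
# dplyr::mutate() preserves the class of its input, so without the final
# as_tibble() call a data.frame input would give a data.frame output.
fn_add_rate <- function(data) {
  data |>
    dplyr::mutate(rate = deaths / population) |>
    tibble::as_tibble()
}

# Invented dummy data.
df_in <- data.frame(
  country = c("AAA", "BBB"),
  deaths = c(10, 20),
  population = c(1000, 4000)
)

# mutate() alone keeps a plain data.frame as a plain data.frame.
mutated_only <- dplyr::mutate(df_in, x = 1)

# The wrangling function returns a tibble for either input type.
out_from_df <- fn_add_rate(df_in)
out_from_tbl <- fn_add_rate(tibble::as_tibble(df_in))
```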
Note that all downstream functions (plotting preparation and plotting functions) that expect tabular data expect a tibble, and will error if they are passed anything else. This is partially to create some friction so that users check what they are passing: vimcheck data-wrangling functions always return a tibble, downstream functions only work on data processed by them, and so an error may indicate that the wrong data are being passed.
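One way such a check could look is sketched below using checkmate. `fn_prep_for_plotting()` and the data are hypothetical, and the actual checks in vimcheck functions may differ.

```r
# Hypothetical downstream function; NOT part of vimcheck.
fn_prep_for_plotting <- function(data) {
  # Errors unless `data` is a tibble with at least one row.
  checkmate::assert_tibble(data, min.rows = 1)
  dplyr::mutate(data, country = factor(country))
}

# Invented dummy data as a plain data.frame.
plain_df <- data.frame(
  country = c("AAA", "BBB"),
  rate = c(0.01, 0.005)
)

# Passing a plain data.frame triggers the assertion error.
result <- tryCatch(
  fn_prep_for_plotting(plain_df),
  error = function(e) "error"
)

# Passing a tibble works as intended.
ok <- fn_prep_for_plotting(tibble::as_tibble(plain_df))
```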
vimcheck functions are tested using package data (see above). As a result, tests focus on input checking and the form of outputs. There are comparatively few tests on correctness (e.g. are output numbers within a range), and this is a clear avenue for further development.
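A correctness test of this kind might look like the sketch below, which pairs a check on output form with a check on output values. The function `fn_prop_averted()`, its columns and the dummy data are all hypothetical; none of this is taken from the vimcheck test suite.

```r
library(testthat)

# Hypothetical function under test; NOT part of vimcheck.
# Computes the proportion of deaths averted by vaccination.
fn_prop_averted <- function(data) {
  data |>
    dplyr::mutate(prop_averted = (deaths_novac - deaths_vac) / deaths_novac) |>
    tibble::as_tibble()
}

# Invented dummy data with known expected results.
dummy <- data.frame(
  country = c("AAA", "BBB"),
  deaths_novac = c(100, 200),
  deaths_vac = c(50, 150)
)

test_that("proportion averted has the right form and values", {
  out <- fn_prop_averted(dummy)

  # Form: output is a tibble with one row per input row.
  checkmate::expect_tibble(out, nrows = nrow(dummy))

  # Correctness: proportions lie in [0, 1] and match hand-computed values.
  expect_true(all(out$prop_averted >= 0 & out$prop_averted <= 1))
  expect_equal(out$prop_averted, c(0.5, 0.25))
})
```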