Case study in the development of a framework for quality and reproducibility in inner-sourced packages and self-service analytic dashboards to accelerate common study types

Abstract

These principles provided a mechanism to improve the reliability of our code, and had the important secondary effect of providing a clear framework for communicating, both with stakeholders and between data scientists, the steps taken to improve the quality and reproducibility of internal study dashboards.

Publication
American Medical Informatics Association Annual Conference

The manifesto

The following five-point manifesto sets out the base principles we apply to interactive applications, allowing us to develop more robust applications that behave in known ways and produce reproducible results.

Code has value

The use of ad hoc code within a study should be minimized, in favour of a culture of collaboration on pan-study code held in documented, reusable packages. Unit tests are ideally written when a function is created, then reviewed and expanded as new use cases arise.
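
As a minimal sketch of this principle, assuming an R package layout with testthat (the tooling is not named in the text; the function and file names are hypothetical), a pan-study helper is packaged and paired with a unit test at creation:

```r
# R/event_rate.R -- hypothetical pan-study helper, packaged rather than ad hoc
#' Proportion of subjects experiencing an event
#' @param events logical vector, TRUE if the subject had the event
#' @return numeric proportion in [0, 1]
event_rate <- function(events) {
  stopifnot(is.logical(events), !anyNA(events))
  mean(events)
}

# tests/testthat/test-event_rate.R -- written when the function is created,
# then expanded as new use cases appear
library(testthat)

test_that("event_rate returns the expected proportion", {
  expect_equal(event_rate(c(TRUE, FALSE, TRUE, TRUE)), 0.75)
})

test_that("event_rate rejects missing values", {
  expect_error(event_rate(c(TRUE, NA)))
})
```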

Democratize access to analytics

Building on the consolidation of code into packages, less technical users should be given access to well-documented packages that support guided analyses, and dashboards should be considered for fully self-service interaction.
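
As a hedged sketch of what "well documented" might mean in practice, assuming roxygen2-style documentation on a hypothetical exported function: the rendered help page and its runnable examples give less technical users a guided starting point.

```r
#' Summarise length of stay by treatment arm
#'
#' @param cohort data.frame with columns `arm` and `los_days`
#' @return data.frame with per-arm counts, means, and medians
#' @examples
#' toy <- data.frame(arm = c("A", "A", "B"), los_days = c(3, 5, 4))
#' summarise_los(toy)
#' @export
summarise_los <- function(cohort) {
  stopifnot(all(c("arm", "los_days") %in% names(cohort)))
  do.call(rbind, lapply(split(cohort, cohort$arm), function(g) {
    data.frame(arm = g$arm[1], n = nrow(g),
               mean_days = mean(g$los_days),
               median_days = stats::median(g$los_days))
  }))
}
```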

Be verbose and specific on input cohorts

Publish cohort derivation code as user-readable markdown vignettes with descriptive statistics and assumption checks.
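
For illustration only, a sketch of the kind of chunk such a vignette might contain, using hypothetical column names and toy data; the rendered document shows the derivation alongside its descriptive statistics and assumption checks.

```r
# Toy stand-in for the study's encounter data (hypothetical columns)
encounters <- data.frame(
  patient_id = 1:5,
  age_years  = c(34, 17, 71, 58, 45),
  index_date = as.Date("2023-06-01") + 0:4,
  los_days   = c(3, 1, 7, 5, 2)
)

# Derivation: adults with a known index date
cohort <- encounters[encounters$age_years >= 18 & !is.na(encounters$index_date), ]

# Descriptive statistics, rendered into the vignette for reviewers
summary(cohort[, c("age_years", "los_days")])

# Assumption checks: the vignette fails to render if these stop holding
stopifnot(
  nrow(cohort) > 0,
  !anyDuplicated(cohort$patient_id),
  all(cohort$index_date <= Sys.Date())
)
```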

Assertively limit user inputs in dashboards

Developers should ensure that variables exposed for manipulation in the app can be varied without compromising validation.
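
One way this could look, sketched here with Shiny (the dashboard framework is not named in the text) and hypothetical variable names: inputs are restricted to pre-validated choices and bounded ranges, and the server re-checks them defensively.

```r
library(shiny)

# Toy data standing in for a derived cohort (hypothetical columns)
cohort <- data.frame(
  arm       = c("A", "A", "B", "B"),
  age_years = c(34, 71, 58, 45),
  los_days  = c(3, 7, 5, 2)
)

ui <- fluidPage(
  selectInput("arm", "Treatment arm", choices = c("A", "B")),           # fixed set
  sliderInput("min_age", "Minimum age", min = 18, max = 90, value = 18), # bounded range
  tableOutput("summary")
)

server <- function(input, output, session) {
  output$summary <- renderTable({
    # Server-side re-check of the constrained input
    validate(need(input$arm %in% c("A", "B"), "Unknown treatment arm"))
    keep <- cohort$arm == input$arm & cohort$age_years >= input$min_age
    data.frame(n = sum(keep), mean_los_days = mean(cohort$los_days[keep]))
  })
}

# Run with: shinyApp(ui, server)
```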

Separate logic from the user interface

Scientific logic should be separated from visualization code. This loosely coupled approach improves the robustness of the system by making it easier to isolate, test and document any logic applied to the data.
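
A minimal sketch of that separation, again assuming Shiny and hypothetical names: the scientific calculation is a plain, unit-testable function (in practice exported from an inner-sourced package), and the UI layer only wires inputs to it and renders the result.

```r
library(shiny)

# Scientific logic: a plain function that can be tested and documented
# independently of any dashboard
readmission_rate <- function(cohort, min_age) {
  eligible <- cohort[cohort$age_years >= min_age, , drop = FALSE]
  mean(eligible$readmitted)
}

# Toy data standing in for a derived cohort (hypothetical columns)
cohort <- data.frame(age_years = c(30, 62, 75), readmitted = c(FALSE, TRUE, TRUE))

# Visualization layer: no scientific logic, only wiring and rendering
ui <- fluidPage(
  sliderInput("min_age", "Minimum age", min = 18, max = 90, value = 18),
  textOutput("rate")
)

server <- function(input, output, session) {
  output$rate <- renderText({
    sprintf("Readmission rate: %.0f%%",
            100 * readmission_rate(cohort, input$min_age))
  })
}

# Run with: shinyApp(ui, server)
```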
