Our paper "Practical computational reproducibility in the life sciences" on bioRXiv
Our paper "Practical computational reproducibility in the life sciences", written with Johannes Köster, Ryan Dale and several members of the US Galaxy team, is now available as preprint on bioRXiv.
Abstract
Many areas of research suffer from poor reproducibility. This problem is particularly acute in computationally intensive domains where results rely on a series of complex methodological decisions that are not well captured by traditional publication approaches. Various guidelines have emerged for achieving reproducibility, but practical implementation of these practices remains difficult. This is because reproducing published computational analyses requires installing many software tools plus associated libraries, connecting tools together into the complete pipeline, and specifying parameters. Here we present a suite of recently emerged technologies which make computational reproducibility not just possible, but, finally, practical in both time and effort. By combining a system for building highly portable packages of bioinformatics software, containerization and virtualization technologies for isolating reusable execution environments for these packages, and an integrated workflow system that automatically orchestrates the composition of these packages for entire pipelines, an unprecedented level of computational reproducibility can be achieved.