Overview
A research project is reproducible if a different researcher can carry out the same analysis with the same data and produce the same overall result. To do so, they need transparent, detailed documentation about all of the steps in the research process and access to the tools—especially code—with which the steps were carried out. Reproducibility enables independent verification, a touchstone for all research.
There are myriad practices, often accompanied by software tools, that can help ensure research projects are reproducible. This overview workshop will help you decide which practices to adopt and when to adopt them. The workshop also highlights additional benefits that many of these practices confer, such as making it easier to collaborate with others. As an overview, this workshop is relatively non-technical, but it provides technical references, including other DataLab workshops, for all of the practices covered.
Learning Goals
After completing this workshop, learners should be able to:
Describe widely-used practices and tools for ensuring that research projects are reproducible, such as:
Writing documentation
File and directory naming conventions
Version control systems
Software environment management
Build systems
Packaging
Software testing
Virtualization and containerization
Explain the advantages and disadvantages of these reproducibility practices
Evaluate whether a given reproducibility practice is relevant to their research project
Identify references they can consult to learn technical details about reproducibility practices
Explain the ways in which reproducibility practices can facilitate collaboration for active research projects
Prerequisites
This workshop is intended for learners at all experience levels, and may benefit learners at different experience levels in different ways. There are no prerequisites, and no prior programming experience is necessary.
Computing Requirements
No specific hardware or software is required.