Reproducible Research Environments with Repo2Docker

Jessica Forde, Tim Head, Chris Holdgraf, Yuvi Panda, Gladys Nalvarete, Benjamin Ragan-Kelley, Erik Sundell

Jun 11, 2018 · ICML 2018 RML Submission
  • Abstract: Reproducibility challenges in machine learning often come down to software engineering practices. Researchers struggle to reproduce another scientist's work because they cannot translate the paper into code that yields similar results, or cannot run the author's code. repo2docker is a simple tool for checking the minimum requirements for reproducing a paper: it builds a Docker image from a repository path or URL. Its goal is to minimize the effort needed to convert a static repository into a working software environment. By inspecting a repository for standard configuration files used in contemporary software engineering and leveraging containerization, repo2docker deterministically reproduces the author's environment so that a researcher can rerun the author's experiments.
  • TL;DR: repo2docker uses standard configuration files to reproduce the environment of a repository in a Docker image.
  • Keywords: reproducibility, docker, python
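
The inspection step described in the abstract can be pictured with a small Python sketch. It is only an illustration under stated assumptions, not repo2docker's actual implementation: the `CONFIG_FILES` table and the `detect_config_files` helper are hypothetical names introduced here, and the file names are a handful of commonly used configuration files rather than the tool's full supported list.

```python
import tempfile
from pathlib import Path

# Hypothetical table of configuration files and what each one declares.
# Illustrative only; this is not repo2docker's own detection logic.
CONFIG_FILES = {
    "Dockerfile": "custom Docker image definition",
    "environment.yml": "conda environment",
    "requirements.txt": "pip packages",
    "apt.txt": "Debian packages",
    "postBuild": "script to run after the environment is built",
}


def detect_config_files(repo_dir):
    """Return the known configuration files present at the top level of repo_dir."""
    root = Path(repo_dir)
    # Preserve the order of CONFIG_FILES so the result is stable across runs.
    return [name for name in CONFIG_FILES if (root / name).is_file()]


if __name__ == "__main__":
    # Build a throwaway "repository" so the sketch runs without network access.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "requirements.txt").write_text("numpy\n")
        (Path(tmp) / "README.md").write_text("demo\n")
        for name in detect_config_files(tmp):
            print(f"{name}: {CONFIG_FILES[name]}")
```

In a real build, the detected files would then drive the steps that assemble the Docker image; that part is left out here because the abstract does not describe it.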