Keywords: Reproducibility
TL;DR: Lack of documentation in published research can make independent replication an unnecessarily laborious task. We introduce this concept as the cost of reproducibility and analyse it over 1061 papers.
Abstract: **Background.** The reproducibility crisis has not left artificial intelligence untouched.
Lack of documentation in published research can make independent replication an
unnecessarily laborious task. We propose the cost of reproducibility as the labour
required to reproduce a method and its results due to lacking documentation.
**Objectives.** We aim to quantify the cost of reproducibility to determine significant
variation between venues. We hypothesise that studies published in venues with
strict reproducibility requirements in the review process are less costly to reproduce.
**Methods.** We propose five dimensions of the cost of reproducibility and evaluate
them on a scale of 1 to 10, using objective characteristics, e.g., availability of code,
data, parameter values and experiment setup. We reviewed 1061 papers published
between 2022 and 2024 from AAAI, ICLR, ICML, IJCAI, JAIR, JMLR and NeurIPS.
**Results.** Machine learning conferences are up to 16.52% less costly to reproduce
than artificial intelligence conferences and 12.91% less costly than journals.
Award-winning papers are not less costly to reproduce than average papers at the same venue.
**Conclusions.** By quantifying the cost of reproducibility, we find that
reproducibility standards only significantly lower cost when they are backed by
community support and strictly enforced in the review process. We encourage the
publication of appendices and reproducibility checklists, and propose a low cost
of reproducibility as a key criterion for paper awards, to drive community change
through examples of best practice.
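The Methods paragraph describes scoring five dimensions of the cost of reproducibility on a 1-to-10 scale from objective characteristics such as the availability of code, data, parameter values and experiment setup. As a minimal sketch of how such per-dimension scores might be aggregated into a paper-level cost, assuming hypothetical dimension names and a simple mean (the paper does not specify its aggregation rule):

```python
# Hypothetical sketch: aggregate five per-dimension reproducibility cost
# scores (1 = low cost, 10 = high cost) into one paper-level score.
# The dimension names and the unweighted mean are assumptions for
# illustration, not the authors' actual protocol.

DIMENSIONS = ["code", "data", "parameters", "setup", "documentation"]

def reproducibility_cost(scores: dict) -> float:
    """Average the five dimension scores, each on a 1-10 scale."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing dimension scores: {missing}")
    for d in DIMENSIONS:
        if not 1 <= scores[d] <= 10:
            raise ValueError(f"score for {d!r} must be in [1, 10]")
    return sum(scores[d] for d in DIMENSIONS) / len(DIMENSIONS)

# Example paper with code and data readily available but sparse
# documentation of parameters and setup.
paper = {"code": 2, "data": 3, "parameters": 5, "setup": 4, "documentation": 6}
print(reproducibility_cost(paper))  # 4.0
```

A weighted mean or a max over dimensions would be equally plausible aggregation choices; the sketch only illustrates the scale and the per-dimension structure described in the abstract.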
Supplementary Material: zip
Primary Area: other topics in machine learning (i.e., none of the above)
Submission Number: 13504