Abstract: The concern that Artificial Intelligence (AI) and Machine Learning (ML) are entering a ``reproducibility crisis'' has spurred significant research in the past few years. Yet from paper to paper, it is often unclear what the authors mean by ``reproducibility'' and where it fits in the larger scope of what we will call the ``scientific rigor'' literature. Ultimately, the lack of clear rigor standards can affect how businesses seeking to adopt AI/ML implement such capabilities. In this survey, we use 66 papers published since 2017 to construct a proposed set of eight high-level categories of scientific rigor, describing what each category is and the history of work conducted in it. We propose that these eight rigor types are not mutually exclusive, and we present a model for how they influence each other. To encourage more researchers to study these questions, we map these rigors to the adoption process in real-world business use cases. In doing so, we can quantify gaps in the literature that suggest an insufficient focus on the issues necessary for scientific rigor research to transition to practice.
Submission Length: Regular submission (no more than 12 pages of main content)
Assigned Action Editor: ~Cho-Jui_Hsieh1
Submission Number: 1322