Keywords: Position paper, code quality, research infrastructure
TL;DR: Codebases associated with ML research are typically an afterthought. This is bad.
Abstract: "Research code" is a common, often self-effacing, term for the kind of code typically released alongside research papers. Research code is notorious for being fragile, poorly documented, and difficult for others to run or extend. In this position paper, we argue that although research code appears to meet the short-term needs of research projects, the practice in fact hurts researchers by limiting the impact of their work and causing fewer people to build on their research. We explore the structural incentives and dynamics of the field that drive these behaviors. We argue that extensibility matters far more than strict reproducibility for research impact, and propose both pragmatic approaches for individual researchers and institutional reforms to encourage the development of more usable and maintainable research software.
Submission Number: 46