- Abstract: We present a program synthesis-oriented dataset consisting of human-written problem statements and solutions to these problems. The problem statements were collected via crowdsourcing, and the program solutions were extracted from human-written solutions in programming competitions, accompanied by input/output examples. We propose using this dataset for program synthesis tasks aimed at working with real user-generated data. As a baseline, we present a few models, with the best model achieving 5.6% accuracy, showcasing both the complexity of the dataset and the large room for future research.
- Keywords: Program synthesis, natural language, competitive programs
- TL;DR: The NAPS dataset enables program synthesis research on real-life, non-trivial programs and problem statements written in a general-purpose language.