NAPS: Natural Program Synthesis Dataset


Jun 01, 2018 ICML 2018 Workshop NAMPI
  • Abstract: We present a program-synthesis-oriented dataset consisting of human-written problem statements and solutions to these problems. The problem statements were collected via crowdsourcing, and the program solutions were extracted from human-written solutions in programming competitions, accompanied by input/output examples. We propose using this dataset for program synthesis tasks aimed at working with real user-generated data. As a baseline, we present a few models, with the best model achieving 5.6% accuracy, showcasing both the complexity of the dataset and the substantial room for future research.
  • Keywords: Program synthesis, natural language, competitive programs
  • TL;DR: The NAPS dataset enables program synthesis research on real-life, non-trivial programs and problem statements written in a general-purpose language.
