Abstract: We present a program-synthesis-oriented dataset consisting of human-written problem statements and solutions to these problems. The problem statements were collected via crowdsourcing, and the program solutions were extracted from human-written solutions in programming competitions, accompanied by input/output examples. We propose using this dataset for program synthesis tasks aimed at working with real user-generated data. As a baseline, we present a few models, with the best model achieving 5.6% accuracy, showing both the complexity of the dataset and the large room for future research.
Keywords: Program synthesis, natural language, competitive programs
TL;DR: The NAPS dataset enables program synthesis research on real-life, non-trivial programs and problem statements written in a general-purpose language.
Community Implementations: [1 code implementation](https://www.catalyzex.com/paper/naps-natural-program-synthesis-dataset/code)