Abstract: We study the problem of generating source code in a strongly typed,
Java-like programming language, given a label (for example a set of
API calls or types) carrying a small amount of information about the
code that is desired. The generated programs are expected to respect a
`"realistic" relationship between programs and labels, as exemplified
by a corpus of labeled programs available during training.
Two challenges in such *conditional program generation* are that
the generated programs must satisfy a rich set of syntactic and
semantic constraints, and that source code contains many low-level
features that impede learning. We address these problems by training
a neural generator not on code but on *program sketches*, or
models of program syntax that abstract out names and operations that
do not generalize across programs. During generation, we infer a
posterior distribution over sketches, then concretize samples from
this distribution into type-safe programs using combinatorial
techniques. We implement our ideas in a system for generating
API-heavy Java code, and show that it can often predict the entire
body of a method given just a few API calls or data types that appear
in the method.
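To make the task concrete, here is a hypothetical illustration of the setting described above (an assumed example, not output from the system): given a label consisting only of the type `FileReader` and the API call `readLine`, a conditional program generator would be expected to produce a complete, type-safe Java method body such as the following.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

public class Example {
    // Hypothetical target program: from the sparse label {FileReader, readLine},
    // the generator should recover a full, well-typed method body that
    // reads every line of a file and returns them as a list.
    public static List<String> readAllLines(String path) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new FileReader(path))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

The names `Example` and `readAllLines` are illustrative choices; the point is that the label specifies only a few API elements, while the generated code must supply all remaining structure, variable names, and types in a way that compiles.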
TL;DR: We give a method for generating type-safe programs in a Java-like language, given a small amount of syntactic information about the desired code.
Keywords: Program generation, Source code, Program synthesis, Deep generative models
Code: [capergroup/bayou](https://github.com/capergroup/bayou)
Community Implementations: [1 code implementation (CatalyzeX)](https://www.catalyzex.com/paper/neural-sketch-learning-for-conditional/code)