Abstract: Program synthesis is the task of automatically generating a program consistent with
a specification. Recent years have seen the proposal of a number of neural approaches
for program synthesis, many of which adopt a sequence generation paradigm similar
to neural machine translation, in which sequence-to-sequence models are trained to
maximize the likelihood of known reference programs. While achieving impressive
results, this strategy has two key limitations. First, it ignores Program Aliasing: the
fact that many different programs may satisfy a given specification (especially with
incomplete specifications such as a few input-output examples). By maximizing
the likelihood of only a single reference program, it penalizes many semantically
correct programs, which can adversely affect the synthesizer's performance. Second,
this strategy overlooks the fact that programs have a strict syntax that can be
efficiently checked. To address the first limitation, we perform reinforcement
learning on top of a supervised model with an objective that explicitly maximizes
the likelihood of generating semantically correct programs. To address the
second limitation, we introduce a training procedure that directly maximizes the
probability of generating syntactically correct programs that fulfill the specification.
We show that our contributions lead to improved accuracy of the models, especially
in cases where the training data is limited.
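To make the two ideas concrete, below is a minimal, self-contained Python sketch; it is not the paper's implementation. It assumes a toy straight-line DSL, a policy given by a flat table of logits, and a binary reward, all of which are illustrative stand-ins (the names `syntax_mask`, `reinforce_step`, etc. are hypothetical). It shows (1) a grammar-derived mask that renormalizes the next-token distribution over syntactically legal tokens, and (2) a REINFORCE-style update that rewards sampled programs satisfying the input-output examples rather than maximizing the likelihood of a single reference program.

```python
import math
import random

TOKENS = ["inc", "double", "END"]   # toy DSL: x+1, x*2, end of program
MAX_LEN = 6

def syntax_mask(prefix):
    # Grammar-derived mask: which tokens may legally follow `prefix`.
    # The only rule here is that a program cannot be empty, so END is
    # forbidden as the first token (a stand-in for a real syntax checker).
    return [tok != "END" or len(prefix) > 0 for tok in TOKENS]

def run(program, x):
    # Toy interpreter for the DSL.
    for tok in program:
        x = x + 1 if tok == "inc" else x * 2
    return x

def sample(logits):
    # Sample a program token by token, renormalizing over legal tokens only,
    # so every sampled program is syntactically valid by construction.
    program, steps = [], []
    while len(program) < MAX_LEN:
        mask = syntax_mask(program)
        ws = [math.exp(l) if ok else 0.0 for l, ok in zip(logits, mask)]
        z = sum(ws)
        probs = [w / z for w in ws]
        i = random.choices(range(len(TOKENS)), weights=probs)[0]
        steps.append((probs, i))
        if TOKENS[i] == "END":
            break
        program.append(TOKENS[i])
    return program, steps

def reinforce_step(logits, examples, lr=0.5, n_samples=50):
    # Objective: maximize E_{p ~ policy}[ 1{p satisfies all I/O examples} ].
    # REINFORCE estimate: for rewarded samples, push the logits toward the
    # chosen tokens (one-hot minus probs) at every decoding step.
    grad = [0.0] * len(TOKENS)
    for _ in range(n_samples):
        program, steps = sample(logits)
        reward = float(all(run(program, x) == y for x, y in examples))
        for probs, i in steps:
            for j in range(len(TOKENS)):
                grad[j] += reward * ((j == i) - probs[j])
    for j in range(len(TOKENS)):
        logits[j] += lr * grad[j] / n_samples

def success_rate(logits, examples, trials=1000):
    # Fraction of sampled programs that satisfy the full specification.
    hits = 0
    for _ in range(trials):
        program, _ = sample(logits)
        hits += all(run(program, x) == y for x, y in examples)
    return hits / trials

examples = [(1, 4), (3, 12)]        # spec consistent with "double; double"
logits = [0.0, 0.0, 0.0]
print("before:", success_rate(logits, examples))
for _ in range(200):
    reinforce_step(logits, examples)
print("after:", success_rate(logits, examples))
```

Masking plus renormalization guarantees that no probability mass is wasted on syntactically invalid outputs, while the binary reward credits any sampled program that satisfies the examples, mirroring the semantic-correctness objective and the Program Aliasing argument in the abstract.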
TL;DR: Using the DSL grammar and reinforcement learning to improve synthesis of programs with complex control flow.
Keywords: Program Synthesis, Reinforcement Learning, Language Model
Community Implementations: [1 code implementation (via CatalyzeX)](https://www.catalyzex.com/paper/leveraging-grammar-and-reinforcement-learning/code)