Abstract: Program synthesis has recently emerged as a promising approach to the image parsing task. Most prior work follows a two-step pipeline: supervised pretraining of a sequence-to-sequence model on synthetic programs, followed by reinforcement learning to fine-tune the model on real reference images. A purely RL-driven approach without supervised pretraining has never proven successful, since useful programs are too sparse in the program space to sample. In this paper, we present the first purely unsupervised learning algorithm that parses constructive solid geometry (CSG) images into context-free grammar (CFG) programs without pretraining. The key ingredients of our approach are entropy regularization and sampling without replacement from the CFG syntax tree, both of which encourage exploration of the search space, and a grammar-encoded tree LSTM that enforces valid outputs. We demonstrate that our unsupervised method generalizes better than supervised training on a synthetic 2D CSG dataset. Additionally, we find that training in this way naturally optimizes the quality of the top-$k$ programs, leading our model to outperform existing methods on a 2D computer-aided design (CAD) dataset under beam search.
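The two exploration mechanisms named in the abstract can be illustrated with a minimal sketch (hypothetical, not the authors' code): at each decoding step, logits over grammar productions are masked so that only expansions valid for the current nonterminal can be sampled, and an entropy bonus on the resulting masked distribution is added to the RL objective to encourage exploration. The production set, mask, and coefficient `beta` below are illustrative assumptions.

```python
import numpy as np

def masked_softmax(logits, valid_mask):
    """Softmax restricted to grammar-valid productions (invalid ones get zero mass)."""
    masked = np.where(valid_mask, logits, -np.inf)
    exp = np.exp(masked - masked[valid_mask].max())
    return exp / exp.sum()

def entropy(probs):
    """Shannon entropy of the masked production distribution."""
    p = probs[probs > 0]
    return -(p * np.log(p)).sum()

# Illustrative example: 5 productions, only 3 valid for the current nonterminal.
logits = np.array([2.0, 0.5, -1.0, 0.3, 1.2])
mask = np.array([True, True, False, True, False])
probs = masked_softmax(logits, mask)

# RL objective sketch: maximize reward plus beta * entropy bonus
# (beta is an assumed exploration coefficient, not taken from the paper).
beta = 0.01
bonus = beta * entropy(probs)
```

Masking guarantees that every sampled token sequence is a syntactically valid CFG derivation, while the entropy term keeps the policy from collapsing prematurely onto a few programs.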
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
One-sentence Summary: We parse CSG images into CFG programs via a reinforcement learning approach without supervised pretraining.
Supplementary Material: zip
Reviewed Version (pdf): https://openreview.net/references/pdf?id=3TEkMCJgQU