- TL;DR: We improve generative models by proposing a meta-algorithm that filters new training data from the model's outputs.
- Abstract: Many challenging prediction problems, from molecular optimization to program synthesis, involve creating complex structured objects as outputs. However, available training data may not be sufficient for a generative model to learn all possible complex transformations. By leveraging the idea that evaluation is easier than generation, we show how a simple, broadly applicable, iterative target augmentation scheme can be surprisingly effective in guiding the training and use of such models. Our scheme views the generative model as a prior distribution, and employs a separately trained filter as the likelihood. In each augmentation step, we filter the model's outputs to obtain additional prediction targets for the next training epoch. Our method is applicable in the supervised as well as semi-supervised settings. We demonstrate that our approach yields significant gains over strong baselines both in molecular optimization and program synthesis. In particular, our augmented model outperforms the previous state-of-the-art in molecular optimization by over 10% in absolute gain.
- Code: https://www.dropbox.com/s/87v6w4aab2txg2y/iterative-target-augmentation.zip?dl=0
- Keywords: data augmentation, generative models, self-training, molecular optimization, program synthesis