Keywords: Automatic program repair, generative models, conditional variational autoencoder
Abstract: Automatic program correction holds the potential of dramatically improving the productivity of programmers. Recent advances in machine learning and NLP have rekindled the hope to eventually fully automate the process of repairing programs. A key challenge is ambiguity, as multiple codes -- or fixes -- can implement the same functionality, and there is uncertainty on the intention of the programmer. As a consequence, datasets by nature fail to capture the full variance introduced by such ambiguities. Therefore, we propose a deep generative model to automatically correct programming errors by learning a distribution over potential fixes. Our model is formulated as a deep conditional variational autoencoder that can efficiently sample diverse fixes for a given erroneous program. In order to account for inherent ambiguity and lack of representative datasets, we propose a novel regularizer to encourage the model to generate diverse fixes. Our evaluations on common programming errors show strong improvements over the state-of-the-art approaches.
2 Replies
Loading