SampleFix: Learning to Correct Programs by Efficient Sampling of Diverse Fixes

Hossein Hajipour; Apratim Bhattacharyya; Mario Fritz

Published: 03 Nov 2020, Last Modified: 05 May 2023 | NeurIPS 2020 CAP Workshop | Readers: Everyone

Keywords: Automatic program repair, generative models, conditional variational autoencoder

Abstract: Automatic program correction holds the potential to dramatically improve the productivity of programmers. Recent advances in machine learning and NLP have rekindled the hope of eventually fully automating program repair. A key challenge is ambiguity: multiple programs, or fixes, can implement the same functionality, and the programmer's intention is uncertain. As a consequence, datasets by nature fail to capture the full variance introduced by such ambiguities. We therefore propose a deep generative model that automatically corrects programming errors by learning a distribution over potential fixes. Our model is formulated as a deep conditional variational autoencoder that can efficiently sample diverse fixes for a given erroneous program. To account for the inherent ambiguity and the lack of representative datasets, we propose a novel regularizer that encourages the model to generate diverse fixes. Our evaluations on common programming errors show strong improvements over state-of-the-art approaches.
