Self-Supervised Bug Detection and Repair

Miltiadis Allamanis; Henry Richard Jackson-Flux; Marc Brockschmidt

Published: 09 Nov 2021, Last Modified: 26 May 2025

NeurIPS 2021 Poster

Readers: Everyone

Keywords: ml4code, bug detection, gnn

Abstract: Machine learning-based program analyses have recently shown the promise of integrating formal and probabilistic reasoning towards aiding software development. However, in the absence of large annotated corpora, training these analyses is challenging. Towards addressing this, we present BugLab, an approach for self-supervised learning of bug detection and repair. BugLab co-trains two models: (1) a detector model that learns to detect and repair bugs in code, (2) a selector model that learns to create buggy code for the detector to use as training data. A Python implementation of BugLab improves by 30% upon baseline methods on a test dataset of 2374 real-life bugs and finds 19 previously unknown bugs in open-source software.

Code Of Conduct: I certify that all co-authors of this work have read and commit to adhering to the NeurIPS Statement on Ethics, Fairness, Inclusivity, and Code of Conduct.

TL;DR: Learn to detect a variety of bugs in source code by asking two models to play a hide-and-seek game: one model inserts a bug, the other tries to find it.
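The hide-and-seek setup can be illustrated with a toy sketch. This is not the BugLab implementation: the single rewrite rule (swapping comparison operators), the function names `candidate_sites`, `insert_bug`, and `detect`, and the `choose` and `score` callables are all assumptions made for illustration. In BugLab both roles are learned models that are co-trained; here they are stand-in callables with no learning, so the sketch only shows how a selector's rewrite yields a labeled example for a detector.

```python
import ast

# Hypothetical rewrite rule: swap a comparison operator for a plausible wrong one.
SWAPS = {
    ast.Lt: ast.LtE, ast.LtE: ast.Lt,
    ast.Gt: ast.GtE, ast.GtE: ast.Gt,
    ast.Eq: ast.NotEq, ast.NotEq: ast.Eq,
}


def candidate_sites(tree):
    """Return every Compare node whose first operator has a known swap."""
    return [
        node for node in ast.walk(tree)
        if isinstance(node, ast.Compare) and type(node.ops[0]) in SWAPS
    ]


def insert_bug(source, choose):
    """Selector step: pick one site via `choose(n_sites)` and rewrite it.

    Returns (buggy_source, index_of_rewritten_site); the index serves as
    the training label for the detector.
    """
    tree = ast.parse(source)
    sites = candidate_sites(tree)
    idx = choose(len(sites))
    node = sites[idx]
    node.ops[0] = SWAPS[type(node.ops[0])]()
    return ast.unparse(tree), idx  # ast.unparse needs Python 3.9+


def detect(source, score):
    """Detector step: score every site, return the most suspicious index."""
    sites = candidate_sites(ast.parse(source))
    return max(range(len(sites)), key=lambda i: score(sites[i]))


SOURCE = (
    "def clamp(x, lo, hi):\n"
    "    if x < lo:\n"
    "        return lo\n"
    "    if x > hi:\n"
    "        return hi\n"
    "    return x\n"
)

# A fixed choice stands in for the learned selector.
buggy, label = insert_bug(SOURCE, choose=lambda n: 1)

# A hand-written heuristic stands in for the learned detector.
guess = detect(buggy, score=lambda node: float(isinstance(node.ops[0], ast.GtE)))
```

In a co-training loop, the detector would be updated on pairs like `(buggy, label)`, while the selector would be pushed toward rewrites the detector currently misses; both updates are omitted here.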

Supplementary Material: pdf

Code: https://github.com/microsoft/neurips21-self-supervised-bug-detection-and-repair/

Community Implementations: [16 code implementations](https://www.catalyzex.com/paper/self-supervised-bug-detection-and-repair/code)

