Learning to Infer Run-Time Invariants from Source codeDownload PDF

28 Sept 2020 (modified: 05 May 2023)ICLR 2021 Conference Blind SubmissionReaders: Everyone
Keywords: Invariants, Software Engineering, Programming Languages
Abstract: Source code is notably different from natural language in that it is meant to be executed. Experienced developers infer complex "invariants" about run-time state while reading code, which helps them to constrain and predict program behavior. Knowing these invariants can be helpful; yet developers rarely encode these explicitly, so machine-learning methods don't have much aligned data to learn from. We propose an approach that adapts cues within existing if-statements regarding explicit run-time expectations to generate aligned datasets of code and implicit invariants. We also propose a contrastive loss to inhibit generation of illogical invariants. Our model learns to infer a wide vocabulary of invariants for arbitrary code, which can be used to detect and repair real bugs. This is entirely complementary to established approaches, which either use logical engines that scale poorly, or run-time traces that are expensive to obtain; when present, that data can complement our tool, as we demonstrate in conjunction with Daikon, an existing tool. Our results show that neural models can derive useful representations of run-time behavior directly from source code.
One-sentence Summary: A technique for predicting run-time invariants from source code alone
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Reviewed Version (pdf): https://openreview.net/references/pdf?id=4HTJHCciCm
6 Replies

Loading