Abstract: Traditional decompilers rely on countless hard-coded rules written by subject-matter experts, which makes them inflexible. Some recent systems address this with deep learning. The current consensus is that such systems must incorporate considerable domain knowledge and iterative heuristic components to solve parts of the decompilation problem, particularly the prediction of identifiers and literals. In this paper, we present a single-pass, end-to-end neural decompilation system that uses a copying mechanism. The mechanism copies literals and (offsets of) variables directly from the assembly code in a single step, within the model's single forward pass. Additionally, we take a further step toward decompiling real-world code by addressing important programming constructs such as switch statements, function definitions, and function calls. We compile a dataset of real-world programming competition code and evaluate our model on it. Without any additional error correction (EC) techniques, the method achieves a program accuracy of 73% on the hardest complexity level of our generated dataset and 51% on the real-world examples, surpassing the results of previous work that does not use EC.
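The abstract does not spell out how the copying mechanism is formulated. As a rough illustration only, the sketch below assumes a pointer-generator-style mixture: the decoder's distribution over a fixed vocabulary is blended with an attention distribution over the assembly tokens, so a literal that is absent from the vocabulary can still be emitted in one step. The function names, the toy vocabulary, the scores, and the mixing weight `p_gen` are all hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def copy_augmented_distribution(vocab, vocab_logits, source_tokens,
                                attention_scores, p_gen):
    """Blend a generation distribution with a copy distribution.

    Assumption: final(tok) = p_gen * P_vocab(tok) + (1 - p_gen) * P_copy(tok),
    where P_copy is the attention mass on source positions holding `tok`.
    """
    gen_probs = softmax(vocab_logits)
    copy_probs = softmax(attention_scores)

    # Start from the scaled vocabulary distribution.
    final = {tok: p_gen * p for tok, p in zip(vocab, gen_probs)}

    # Add copy mass; source tokens outside the vocabulary get new entries.
    for tok, p in zip(source_tokens, copy_probs):
        final[tok] = final.get(tok, 0.0) + (1.0 - p_gen) * p
    return final


# Toy decoding step: the model should emit the literal 1337, which
# appears in the assembly but not in the output vocabulary.
vocab = ["int", "x", "=", ";", "<unk>"]
vocab_logits = [0.1, 0.2, 0.1, 0.0, 0.5]
source_tokens = ["mov", "DWORD", "PTR", "[rbp-4]", "1337"]
attention_scores = [0.0, 0.0, 0.0, 1.0, 4.0]

dist = copy_augmented_distribution(
    vocab, vocab_logits, source_tokens, attention_scores, p_gen=0.2
)
best = max(dist, key=dist.get)
print(best)
```

With these toy numbers most of the probability mass lands on the copied literal rather than on any vocabulary token. This is the behavior the abstract attributes to copying: the token is taken from the input within the same forward pass, with no separate recovery step.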