Keywords: compiler, instruction combining, seq2seq model, neural machine translation
TL;DR: We propose replacing the traditional instruction combiner optimization pass in the LLVM compiler with a neural instruction combiner based on a Seq2Seq neural network model, and we demonstrate its feasibility.
Abstract: The instruction combiner (IC) is a critical compiler optimization pass that replaces a sequence of instructions with an equivalent, optimized instruction sequence at the basic-block level. There can be thousands of instruction-combining patterns, and they must be updated frequently as new coding styles, idioms, and applications emerge and as hardware evolves. Maintaining the IC pass therefore demands considerable human effort and incurs high software maintenance costs. To mitigate these challenges, we design and implement a Neural Instruction Combiner (NIC) and demonstrate its feasibility by integrating it into the standard LLVM compiler optimization pipeline. NIC uses a neural Seq2Seq model to generate an optimized encoded Intermediate Representation (IR) sequence from an unoptimized encoded IR sequence. To the best of our knowledge, this is the first work to demonstrate the feasibility of a neural instruction combiner built into a full-fledged compiler pipeline. Because the task is new, we built a new dataset for training the NIC model. NIC produces optimized sequences that exactly match the output of the traditional IC in $72\%$ of cases and achieves a BLEU precision score of $0.94$, demonstrating its feasibility in a production compiler pipeline.
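The two reported metrics can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the token encoding, and the toy sequences below are all assumptions made for illustration, and the clipped n-gram precision shown is the standard building block of BLEU rather than the exact score definition used in the paper.

```python
from collections import Counter


def exact_match_rate(predictions, references):
    """Fraction of predicted token sequences identical to their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not predictions:
        return 0.0
    matches = sum(p == r for p, r in zip(predictions, references))
    return matches / len(predictions)


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(prediction, reference, n):
    """BLEU-style modified n-gram precision for one sequence pair.

    Each predicted n-gram is credited at most as many times as it
    occurs in the reference.
    """
    pred_counts = Counter(ngrams(prediction, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(pred_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(c, ref_counts[g]) for g, c in pred_counts.items())
    return clipped / total


# Hypothetical encoded IR tokens; the paper's actual encoding is not shown here.
references = [
    ["add", "i32", "%a", "%b"],   # output of the traditional IC
    ["shl", "i32", "%a", "1"],
]
predictions = [
    ["add", "i32", "%a", "%b"],   # model output identical to the IC
    ["mul", "i32", "%a", "2"],    # model output that differs from the IC
]

em = exact_match_rate(predictions, references)
p1 = clipped_precision(predictions[1], references[1], 1)
p2 = clipped_precision(predictions[1], references[1], 2)
```

In this toy case one of the two predictions matches exactly, while the mismatched pair still shares some unigrams and bigrams with its reference. That is why an n-gram precision score can be well above the exact-match percentage on the same outputs.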