Abstract: The MLIR (Multi-Level Intermediate Representation) compiler infrastructure has become popular in recent years as a foundation for building compilers. Instead of designing a new IR with a single abstraction for each domain, MLIR provides systematic passes that support a wide range of functionality shared across domains, and introduces dialects to represent different levels of abstraction. Given its fundamental role in the compiler community, ensuring its quality is critical. In this work, we propose MLIRSmith, the first fuzzing technique for the MLIR compiler infrastructure. MLIRSmith employs a two-phase strategy to generate valid and diverse MLIR programs: it first constructs diverse program templates guided by extended MLIR syntax rules, and then generates valid MLIR programs by instantiating those templates under the guidance of our context-sensitive grammar. Applying MLIRSmith to the latest revision of the MLIR compiler infrastructure, we detected 53 previously unknown bugs, of which 49 have been confirmed and 38 have been fixed by developers. As a baseline, we also transform the high-level programs generated by NNSmith (a high-level program generator for deep learning compilers) into MLIR programs in order to fuzz the MLIR compiler infrastructure indirectly. Within the same testing time, MLIRSmith substantially outperforms this indirect technique, detecting 328.57% more bugs and covering 194.67% more lines and 225.87% more branches of the MLIR compiler infrastructure.
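
The two-phase idea can be illustrated with a toy sketch. This is not MLIRSmith's implementation: the op table, the function names (`build_template`, `instantiate`, `generate`), and the restriction to four `arith` operations are assumptions made purely for illustration. Phase 1 chooses only the operation kinds, leaving operands as holes; phase 2 fills each hole with an SSA value of the matching type that is already in scope, which is the kind of constraint a context-sensitive grammar is meant to enforce.

```python
import random

# Hypothetical toy op table (op name -> operand/result type).
# An assumption for illustration, not MLIRSmith's actual grammar.
BINARY_OPS = {
    "arith.addi": "i32",
    "arith.muli": "i32",
    "arith.addf": "f32",
    "arith.mulf": "f32",
}


def build_template(rng, num_ops):
    """Phase 1: choose operation kinds only; operands remain unfilled holes."""
    return [rng.choice(sorted(BINARY_OPS)) for _ in range(num_ops)]


def instantiate(template, rng):
    """Phase 2: fill each hole with an in-scope SSA value of the right type."""
    lines = []
    defined = {"i32": [], "f32": []}  # context: SSA values in scope, by type
    counter = 0

    def fresh(ty, text):
        # Emit one operation and record its result as available context.
        nonlocal counter
        name = f"%{counter}"
        counter += 1
        lines.append(f"  {name} = {text} : {ty}")
        defined[ty].append(name)
        return name

    for op in template:
        ty = BINARY_OPS[op]
        if not defined[ty]:
            # No value of this type exists yet, so define a constant first.
            literal = "1" if ty == "i32" else "1.0"
            fresh(ty, f"arith.constant {literal}")
        lhs = rng.choice(defined[ty])
        rhs = rng.choice(defined[ty])
        fresh(ty, f"{op} {lhs}, {rhs}")

    return "func.func @main() {\n" + "\n".join(lines) + "\n  return\n}"


def generate(seed, num_ops=4):
    """Run both phases with a seeded RNG so the output is reproducible."""
    rng = random.Random(seed)
    return instantiate(build_template(rng, num_ops), rng)


if __name__ == "__main__":
    print(generate(seed=0))
```

Separating the phases means structural diversity (which operations appear, and in what order) is decided independently of validity (every operand refers to a previously defined value of the right type). A real generator would additionally have to handle regions, attributes, multiple dialects, and far richer type constraints.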