CTDip: a diversity-guided test program synthesis approach for boosting compiler bug detection

Published: 2025, Last Modified: 07 Jan 2026 | Empir. Softw. Eng. 2025 | CC BY-SA 4.0
Abstract: Compiler testing is an important task for assuring the quality of compilers. However, most mutation-based compiler testing approaches still suffer from limited effectiveness due to ineffective mutation strategies. In this paper, we propose CTDip, a diversity-guided test program synthesis approach for discovering more bugs in optimizing compilers. Its key insight is to synthesize diverse and effective bug-triggering test programs by integrating historically bug-triggering structures into seed programs via a mutator scheduling strategy. Specifically, CTDip first examines the test programs that triggered historical compiler bugs and identifies four categories of bug-triggering structures to help generate new bug-triggering test programs. Then, CTDip applies AST-level mutations to integrate the extracted structures into seed programs, using a mutator scheduler that iteratively schedules mutators to synthesize diverse test programs. Finally, given the generated test programs, CTDip uses differential testing based on a local hash checksum to test compilers. Experiments on GCC and LLVM show that CTDip detects more compiler bugs than four state-of-the-art approaches (i.e., GrayC, Clang-Fuzzer, universalmutator, and Csmith) and improves the line coverage of GCC and LLVM by 3.38% to 22.40%. Moreover, over nearly one year of testing, CTDip detected 21 bugs in the latest development versions of GCC and LLVM, of which 18 have been confirmed or fixed by developers and 14 are miscompilation bugs (the most difficult kind to detect).
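The checksum-based differential testing step can be illustrated with a minimal sketch. The abstract does not say how CTDip computes its local hash checksum, so this example makes a simplifying assumption: it hashes the whole output of a test program under each compiler configuration and flags any configuration that disagrees with the majority. The function names `checksum` and `find_deviants` and the configuration labels are hypothetical and are not taken from the paper.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def checksum(output: bytes) -> str:
    """Hash a program's output (a stand-in for CTDip's local hash checksum)."""
    return hashlib.sha256(output).hexdigest()


def find_deviants(outputs: Dict[str, bytes]) -> List[str]:
    """Return the configurations whose checksum differs from the majority.

    `outputs` maps a compiler configuration label (hypothetical, e.g.
    "gcc -O2") to the bytes printed by the test program compiled with it.
    An empty result means all configurations agree.
    """
    sums = {cfg: checksum(out) for cfg, out in outputs.items()}
    # The most common checksum is treated as the reference behaviour.
    majority, _count = Counter(sums.values()).most_common(1)[0]
    return sorted(cfg for cfg, s in sums.items() if s != majority)


if __name__ == "__main__":
    observed = {
        "gcc -O0": b"checksum = 1A2B\n",
        "gcc -O2": b"checksum = 1A2B\n",
        "gcc -O3": b"checksum = FFFF\n",  # simulated miscompilation
    }
    print(find_deviants(observed))
```

In a real setup the byte strings would come from compiling and running the same synthesized test program under each configuration. A disagreement is only a candidate miscompilation, since it can also stem from undefined behavior in the test program.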