Abstract: The quality of software is closely tied to the effectiveness of the tests it undergoes. Manual test writing, though crucial for bug detection, is time-consuming, which has driven significant research into automated test case generation. However, current methods often struggle to generate relevant inputs, limiting the effectiveness of the tests produced. To address this, we introduce BRMiner, a novel approach that leverages Large Language Models (LLMs) in combination with traditional techniques to extract relevant inputs from bug reports, thereby enhancing automated test generation tools. In this study, we evaluate BRMiner using the Defects4J benchmark and test generation tools such as EvoSuite and Randoop. Our results demonstrate that BRMiner achieves a Relevant Input Rate (RIR) of 60.03% and a Relevant Input Extraction Accuracy Rate (RIEAR) of 31.71%, significantly outperforming methods that rely on LLMs alone. The integration of BRMiner's inputs enhances EvoSuite's ability to generate more effective tests, leading to increased code coverage, with gains observed in branch, instruction, method, and line coverage across multiple projects. Furthermore, BRMiner facilitates the detection of 58 unique bugs, including some missed by traditional baseline approaches. Overall, BRMiner's combination of LLM-based filtering with traditional input extraction techniques significantly improves the relevance and effectiveness of automated test generation, advancing bug detection and enhancing code coverage, thereby contributing to higher-quality software development.