Production and test bug report classification based on transfer learning

Misoo Kim; Youngkyoung Kim; Eunseok Lee

Published: 01 Jan 2025, Last Modified: 02 Sept 2025. Venue: Inf. Softw. Technol. 2025. License: CC BY-SA 4.0

Abstract: Context: Recent studies indicate that classifying bug reports as production or test bug reports can substantially enhance the accuracy of performance evaluation and the effectiveness of information retrieval–based bug localization (IRBL) for software reliability. Objective: However, manually classifying these bug reports is time-consuming for developers. This study introduces a production and test bug report classification (ProTeC) framework for automatically classifying these reports. Methods: The framework’s novelty lies in leveraging a set of production and test source files and employing transfer learning to address the problem that bug reports are too few and too sparse for machine-learning applications. The ProTeC framework first trains a source file classifier and then fine-tunes it into a bug report classifier, thereby transferring the knowledge that distinguishes production from test code. Results: To validate the effectiveness and general practicality of ProTeC, we conducted large-scale experiments using 2,522 bug reports across 12 machine/deep learning model variations to train an automatic classifier. On average, ProTeC’s macro F1-score is 28.6% higher than that of a bug report-based classifier, and it can improve the mean average precision of IRBL by 17.6%. Conclusion: These positive trends were observed in most model variations, indicating that ProTeC consistently performs well in classifying bug reports regardless of the model used, thereby improving IRBL performance.
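
The two-stage idea in the Methods (pre-train on labeled source files, then fine-tune on the scarcer bug reports) and the macro F1-score used in the Results can be illustrated with a minimal sketch. The abstract does not specify ProTeC's models, features, or hyperparameters, so everything below is an assumption made for illustration: a stand-in bag-of-words logistic regression, a hypothetical `train_with_transfer` helper, invented toy source files and bug reports, and arbitrary epoch counts and learning rates.

```python
import math
import re
from collections import Counter

PRODUCTION, TEST = 0, 1

def tokenize(text):
    """Split identifiers on camelCase boundaries and lowercase the pieces."""
    return [t.lower() for t in re.findall(r"[A-Za-z][a-z]*", text)]

class LinearTextClassifier:
    """Stand-in model (assumption): logistic regression on normalized token counts."""

    def __init__(self):
        self.weights = {}
        self.bias = 0.0

    def _features(self, text):
        counts = Counter(tokenize(text))
        total = sum(counts.values()) or 1
        return {tok: c / total for tok, c in counts.items()}

    def predict_proba(self, text):
        """Probability that the text is test-related."""
        feats = self._features(text)
        z = self.bias + sum(self.weights.get(t, 0.0) * v for t, v in feats.items())
        return 1.0 / (1.0 + math.exp(-z))

    def fit(self, texts, labels, epochs, lr):
        """SGD that continues from the current weights (no re-initialization)."""
        for _ in range(epochs):
            for text, y in zip(texts, labels):
                err = self.predict_proba(text) - y
                for tok, v in self._features(text).items():
                    self.weights[tok] = self.weights.get(tok, 0.0) - lr * err * v
                self.bias -= lr * err
        return self

def train_with_transfer(source_files, bug_reports):
    """Hypothetical helper: pre-train on source files, then fine-tune on bug reports."""
    model = LinearTextClassifier()
    src_texts, src_labels = zip(*source_files)
    model.fit(src_texts, src_labels, epochs=50, lr=1.0)  # stage 1: source file classifier
    br_texts, br_labels = zip(*bug_reports)
    model.fit(br_texts, br_labels, epochs=10, lr=0.2)    # stage 2: fine-tune on bug reports
    return model

def macro_f1(y_true, y_pred, classes=(PRODUCTION, TEST)):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Invented toy data, for illustration only.
SOURCE_FILES = [
    ("public class OrderService { void placeOrder(Order order) { repository.save(order); } }", PRODUCTION),
    ("public class OrderServiceTest { @Test void placesOrder() { assertEquals(1, repository.count()); } }", TEST),
    ("public class CartService { Cart checkout(Cart cart) { return gateway.charge(cart); } }", PRODUCTION),
    ("public class CartServiceTest { @Before void setup() { gateway = mock(Gateway.class); } }", TEST),
]
BUG_REPORTS = [
    ("Order total is wrong after checkout when a coupon is applied", PRODUCTION),
    ("OrderServiceTest fails intermittently because the mock is not reset", TEST),
    ("Charge is sent twice when the user retries payment", PRODUCTION),
    ("Flaky assertEquals in CartServiceTest setup on CI", TEST),
]

model = train_with_transfer(SOURCE_FILES, BUG_REPORTS)
for report in ("assertEquals fails with mock", "Payment charge wrong after checkout"):
    print(f"{model.predict_proba(report):.3f}  {report}")
```

The point of the sketch is that the second `fit` call starts from the weights learned on source files, so vocabulary that separates test code from production code (for example `assert` or `mock`) is already informative before any bug report is seen. Macro F1 averages the per-class F1 scores without weighting by class size, so the minority class counts as much as the majority class.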
