Abstract: Designing and testing modern network systems is a complex and costly process, particularly during the testing phase, where time constraints and unpredictability often inflate development effort. In this paper, we propose a novel methodology that leverages natural language comments from developer commits to optimize test execution. By extracting meaningful insights from these comments, we construct a prioritized task list that focuses on the tasks most likely to fail. This approach streamlines the testing workflow, accelerates execution, and reduces overall development cost. Our pipeline combines transformer-based models for semantic understanding with unsupervised anomaly detection methods, including clustering algorithms, autoencoders (AE), and variational autoencoders (VAE), to identify failure-prone scenarios without requiring labeled data. Applied to a real-world industrial dataset, the AE model achieves an F1 score of 0.838, and the VAE follows closely with 0.805, significantly outperforming traditional clustering approaches. These results highlight the effectiveness of leveraging commit metadata for intelligent test prioritization and demonstrate the potential for scalable improvements in continuous integration pipelines.
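The abstract's core idea, scoring commit-message embeddings by autoencoder reconstruction error and running the highest-scoring tests first, can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes precomputed embeddings (stand-ins for transformer outputs) and uses a linear autoencoder (equivalent to PCA), fitted with NumPy, in place of the paper's neural AE/VAE.

```python
import numpy as np

def fit_linear_ae(X, k):
    """Fit a rank-k linear autoencoder (equivalent to PCA) on commit embeddings."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, Vt[:k]  # mean + top-k principal directions as encoder/decoder weights

def anomaly_scores(X, mu, W):
    """Reconstruction error per embedding; high error marks a failure-prone candidate."""
    Z = (X - mu) @ W.T        # encode into the k-dimensional latent space
    X_hat = Z @ W + mu        # decode back to embedding space
    return np.linalg.norm(X - X_hat, axis=1)

rng = np.random.default_rng(0)
# Hypothetical commit-message embeddings: routine commits lie near a 2-D subspace of R^8
basis = rng.normal(size=(2, 8))
history = rng.normal(size=(200, 2)) @ basis + 0.01 * rng.normal(size=(200, 8))
mu, W = fit_linear_ae(history, k=2)

# New commits to prioritize: two routine ones, plus one far off the learned subspace
new_commits = np.vstack([rng.normal(size=(2, 2)) @ basis,
                         10.0 * rng.normal(size=(1, 8))])
scores = anomaly_scores(new_commits, mu, W)
priority = np.argsort(-scores)  # run tests for the most anomalous commits first
```

Ranking tests by `priority` realizes the prioritized task list described above: commits whose comments deviate from historical patterns are assumed more likely to break something, so their tests execute earliest in the CI run.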
External IDs: dblp:conf/ijcnn/VignaliSR25