OneMoreTest: A Learning-Based Approach to Generating and Selecting Fault-Revealing Unit Tests

Wei Wei, Yanjie Jiang, Yahui Li, Lu Zhang, Hui Liu

Published: 01 Aug 2025, Last Modified: 25 Jan 2026. Venue: IEEE Transactions on Software Engineering. License: CC BY-SA 4.0
Abstract: Developers often manually design a few unit tests for a given method under development. Once the method passes these manually designed tests, however, they usually have to turn to automated test case generation tools like EvoSuite and Randoop for more thorough testing. Although the automatically generated tests may achieve high coverage, they rarely identify hard-to-detect defects automatically because of the well-known test oracle problem: it is challenging to tell whether an output is correct or incorrect without an explicit test oracle (expected output). Consequently, developers must manually select and verify a few suspicious test cases to identify hard-to-detect defects. To this end, we propose a novel approach, called OneMoreTest, for generating and selecting the most suspicious tests for manual verification. Starting from a manually designed passing test, OneMoreTest automatically generates millions of input-output pairs for the method under test (MUT) with mutation-based fuzzing. It then trains an automatically generated neural network to simulate the MUT’s behavior. For new tests automatically generated for the same MUT, OneMoreTest recommends to developers the top $k$ most suspicious tests, i.e., those with the greatest distances between their actual output and their estimated output (the network’s output). Our evaluation on real-world faulty methods suggests that OneMoreTest is accurate. For 70.79% of the 178 real-world faulty methods involved, the defect can be identified by manually verifying only a single test per method, following OneMoreTest’s suggestions. Compared against the state of the art, OneMoreTest improved precision from 46.63% to 72.62% and recall from 46.63% to 70.79%.
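The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scalar numeric outputs, uses absolute difference as the distance, and stands in a plain Python function for the trained network. The names `rank_suspicious_tests`, `buggy_abs`, and `surrogate` are hypothetical and introduced only for illustration.

```python
from typing import Callable, List, Sequence


def rank_suspicious_tests(
    mut: Callable[[int], float],
    surrogate: Callable[[int], float],
    inputs: Sequence[int],
    k: int = 1,
) -> List[int]:
    """Return the k inputs whose actual output is farthest from the estimate.

    Assumption: outputs are scalars and distance is the absolute difference.
    """
    scored = []
    for x in inputs:
        actual = mut(x)           # output of the method under test
        estimated = surrogate(x)  # output of the model simulating the MUT
        scored.append((abs(actual - estimated), x))
    # Largest distance first; these are the tests suggested for manual review.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [x for _, x in scored[:k]]


def buggy_abs(x: int) -> int:
    """Hypothetical faulty MUT: correct everywhere except x == 7."""
    return -7 if x == 7 else abs(x)


# Stand-in for the trained network (assumed here to approximate the intended behavior).
surrogate = abs

top = rank_suspicious_tests(buggy_abs, surrogate, [-2, 0, 7, 5], k=1)
```

In this toy setting the only input with a non-zero distance is the one that triggers the fault, so it is ranked first. In practice the estimate is imperfect, which is why the suggested tests still require manual verification.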