ATOC: Automated Test Oracle Construction Based on Large Language Models

01 Mar 2025 (modified: 02 Mar 2025) | XJTU 2025 CSUC Submission | CC BY 4.0
Keywords: Test Oracle, Large Language Models, Automation
Abstract: Deep learning (DL) frameworks are increasingly popular because of their wide range of applications, yet testing them remains very difficult despite the high reliability they require. Current DL framework testing still focuses on exercising models or APIs with various fuzzing tools and neglects large language models (LLMs), which may help construct test oracles automatically. To the best of my knowledge, this is the first attempt to generate test oracles for DL libraries automatically with LLMs. A pivotal challenge in DL framework testing is accurately determining the expected output (the test oracle) for a given input, which demands a profound understanding of both the DL framework and its internal operating limitations. Traditional testing approaches, which rely on manually written test cases and test oracles, fall short given the complexity and extensive use of current DL frameworks. LLMs can simplify this process. I therefore propose ATOC, an innovative strategy that uses LLMs to automatically construct test oracles for DL frameworks. By leveraging LLMs' advanced natural language processing abilities, ATOC analyzes and comprehends DL frameworks, enabling automated test oracle generation and expected-output determination. To evaluate the effectiveness of my approach, I apply ATOC to test the PyTorch framework on 1500 APIs. The results demonstrate that ATOC is effective in detecting bugs and inconsistencies, especially crashes (100.0%), flaky behavior (75.0%), and hangs (66.7%).
Submission Number: 29