Abstract: Software defect prediction is a critical precursor task to software defect detection. In recent years, most research efforts have focused on leveraging static code metrics for this task, yet such approaches struggle to generalize across projects because they lack code semantic features. While emerging studies recognize the importance of code semantics, high-quality open-source datasets remain scarce because large-scale manual annotation is prohibitively expensive. Given the remarkable capabilities demonstrated by Large Language Models (LLMs) such as GPT in data synthesis tasks, we propose leveraging LLMs for automated software defect data synthesis and partially open-sourcing the generated datasets. Our methodology adopts the Common Weakness Enumeration (CWE) as the defect taxonomy standard, designs structured prompts grounded in software engineering and defect detection principles for data sampling and labeling, and systematically analyzes both model-specific synthesis limitations and dataset quality. The experimental results reveal intriguing insights that offer new perspectives for automated software defect annotation research. (For dataset access inquiries, please contact us via email.)
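To make the described workflow concrete, the sketch below illustrates what a structured, CWE-grounded labeling prompt and a single LLM labeling call might look like. It is a minimal illustration under our own assumptions: the CWE subset, the prompt fields, and the `call_llm` helper are hypothetical stand-ins, not the paper's actual prompts or pipeline.

```python
import json

# Illustrative subset of CWE categories used as the defect taxonomy (hypothetical choice).
CWE_SUBSET = {
    "CWE-89": "SQL Injection",
    "CWE-476": "NULL Pointer Dereference",
    "CWE-787": "Out-of-bounds Write",
}

# Structured prompt template: task description, candidate CWE list, and a strict
# JSON output contract so the label can be parsed automatically.
PROMPT_TEMPLATE = """You are a software defect annotator.
Decide whether the code snippet below contains a defect.
If it does, assign the single most fitting CWE ID from this list:
{cwe_list}

Return strict JSON: {{"defective": true or false, "cwe_id": "<ID or null>", "rationale": "<one sentence>"}}

Code ({language}):
{code}
"""

def build_labeling_prompt(code: str, language: str) -> str:
    """Fill the structured template with the snippet and the candidate CWE list."""
    cwe_list = "\n".join(f"- {cid}: {name}" for cid, name in CWE_SUBSET.items())
    return PROMPT_TEMPLATE.format(cwe_list=cwe_list, language=language, code=code)

def label_snippet(code: str, language: str, call_llm) -> dict:
    """call_llm is any user-supplied function mapping a prompt string to the model's
    text reply (e.g., a wrapper around a GPT API client)."""
    reply = call_llm(build_labeling_prompt(code, language))
    # In practice the caller should validate and retry on malformed JSON.
    return json.loads(reply)
```

In this sketch, keeping the output format as strict JSON is what allows the synthesized labels to be collected into a dataset without manual post-processing; how closely this mirrors the actual study's prompts is not specified here.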
Paper Type: Long
Research Area: NLP Applications
Research Area Keywords: code generation and understanding
Contribution Types: NLP engineering experiment, Approaches to low-resource settings, Data resources
Languages Studied: Chinese, Java, JS, Python, C++
Submission Number: 4087