Defects4C: Benchmarking C/C++ Faults to Assess LLM-Based Program Repair

27 Sept 2024 (modified: 05 Feb 2025) · Submitted to ICLR 2025 · CC BY 4.0
Keywords: Defects4C; Large Language Model; Program Repair
TL;DR: Defects4C, a high-quality executable benchmark for C/C++ defects
Abstract: Automated Program Repair (APR) plays a pivotal role in ensuring software quality and reliability. However, most existing APR research focuses on Java programs, largely due to well-established benchmarks such as Defects4J. Despite the prevalence of C/C++ vulnerabilities, automated repair of such faults remains under-studied, primarily because the field lacks high-quality open-source benchmarks in this domain. To close this gap, this paper introduces Defects4C, a comprehensive, high-quality executable benchmark designed to improve defect detection and repair. The dataset comprises a large collection of bug-relevant commits (**9M** in total), along with **248** high-quality buggy functions and **102** vulnerable functions, each paired with test cases for reproduction. These data can be used both to evaluate repair techniques and to retrain learning-based methods for improved performance. Using this dataset, we evaluate state-of-the-art LLM-based automated program repair techniques on C/C++ faults, conducting an extensive empirical study with **24** leading LLMs. Our findings provide valuable insights into the capabilities and limitations of existing APR approaches for C/C++ programs, underscoring the need for novel APR techniques and the significance of Defects4C. The benchmark marks a significant advancement in the field, offering a robust and comprehensive C/C++ dataset that is instrumental for future research on program repair.
Primary Area: datasets and benchmarks
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2025/AuthorGuide.
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 9616