VFProber: A Vulnerability-Fixing Identification Framework Based on Code Changes and Semantic Adjustment
Abstract: As software development accelerates, developers face the continuous challenge of fixing vulnerabilities. However, vulnerability-fixing commits are often disassociated from the vulnerabilities they address, and the structural and semantic differences between code changes and natural language make these commits difficult to identify. Existing approaches apply machine learning and deep learning techniques to this problem, but they often do not fully leverage the information contained in code changes. In this paper, we propose VFProber, a method based on a code change pretrained model that aims to provide a comprehensive and unified framework for identifying vulnerability-fixing commits. First, VFProber uses semantic adjustment to distinguish between context-sensitive and context-insensitive code units in code changes, thereby enhancing the model’s understanding of code changes during training. Second, VFProber employs a novel code change pretrained model as a feature extractor, which meets the requirements of the vulnerability-fixing identification task better than ordinary code pretrained models. Moreover, we constructed a vulnerability-fixing dataset from industrial projects covering two common programming languages, Java and JavaScript. In the experimental section, we designed three tasks to evaluate the method. The results show that VFProber outperforms the best baseline in the vulnerability-fixing identification task and effectively reduces false positives and false negatives.
External IDs: dblp:conf/compsac/DongFYLYYC25