Keywords: black-box, adversarial attack, pre-trained models for programming languages, code model
TL;DR: We use the uncertainty of model outputs to guide the search for adversarial examples via variable name replacement.
Abstract: Pre-trained models for programming languages are widely used to solve code tasks in the Software Engineering (SE) community, such as code clone detection and bug identification. Reliability is the primary concern for these machine learning applications in SE, because software failures can lead to intolerable losses. However, deep neural networks are known to suffer from adversarial attacks. In this paper, we propose a novel black-box adversarial attack based on model behaviors for pre-trained programming language models, named Representation Nearest Neighbor Search (RNNS). The proposed approach can efficiently identify adversarial examples via variable replacement in an ample search space of real variable names under similarity constraints. We evaluate RNNS on 6 code tasks (e.g., clone detection), 3 programming languages (Java, Python, and C), and 3 pre-trained code models: CodeBERT, GraphCodeBERT, and CodeT5. The results demonstrate that RNNS outperforms the state-of-the-art black-box attack method (MHM) in terms of both attack success rate and the quality of the generated adversarial examples.
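To make the abstract's idea concrete, below is a minimal, hypothetical sketch of an uncertainty-guided, black-box variable-replacement attack of this kind. It is not the authors' implementation: the helpers `victim_predict` (returns class probabilities for a code snippet), `embed_name` (maps a variable name to a vector), the candidate-name embedding table, and the top-2-gap uncertainty score are all illustrative assumptions.

```python
# Illustrative sketch only: victim_predict, embed_name, and the candidate
# table are hypothetical stand-ins, not the RNNS code from the paper.
import numpy as np

def uncertainty(probs: np.ndarray) -> float:
    # Smaller gap between the two largest class probabilities => less confident model.
    top2 = np.sort(probs)[-2:]
    return 1.0 - float(top2[1] - top2[0])

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def nn_variable_attack(code: str, var: str, candidates: dict,
                       victim_predict, embed_name,
                       sim_threshold: float = 0.5, budget: int = 50):
    """Black-box attack sketch: try real variable names that are nearest
    neighbors of `var` in representation space, keeping the victim's
    uncertainty as a search signal."""
    orig_label = int(np.argmax(victim_predict(code)))
    anchor = embed_name(var)
    # Rank candidate names by similarity to the original variable name.
    ranked = sorted(candidates, key=lambda n: cosine(anchor, candidates[n]),
                    reverse=True)
    best_code, best_unc = code, -1.0
    for name in ranked[:budget]:
        if cosine(anchor, candidates[name]) < sim_threshold:
            break  # similarity constraint: stop once candidates drift too far
        mutated = best_code.replace(var, name)  # a real attack would rename syntax-aware
        probs = victim_predict(mutated)
        if int(np.argmax(probs)) != orig_label:
            return mutated                      # label flipped: adversarial example found
        if uncertainty(probs) > best_unc:       # otherwise keep the most uncertain mutant
            best_unc, best_code = uncertainty(probs), mutated
    return None
```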
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors’ identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics
Submission Guidelines: Yes
Please Choose The Closest Area That Your Submission Falls Into: Applications (eg, speech processing, computer vision, NLP)