Abstract: As code intelligence and collaborative computing advance, code representation models (CRMs) have demonstrated exceptional performance in tasks such as code prediction and collaborative code development by leveraging distributed computing resources and shared datasets. Nonetheless, CRMs are often considered unreliable due to their vulnerability to adversarial attacks: they fail to make correct predictions when faced with perturbed inputs. Several adversarial attack methods have been proposed to evaluate the robustness of CRMs and ensure their reliability in practice. However, these methods rely primarily on code’s textual features without fully exploiting its crucial structural features. To address this limitation, we propose STRUCK, a novel adversarial attack method that thoroughly exploits code’s structural features. The key idea of STRUCK is to integrate multiple global and local perturbation methods and to select among them effectively by leveraging the structural features of the input code during the generation of adversarial examples for CRMs. We conduct comprehensive evaluations of seven basic or advanced CRMs on two prevalent code classification tasks, demonstrating STRUCK’s effectiveness, efficiency, and imperceptibility. Finally, we show that STRUCK enables a more precise assessment of CRMs’ robustness and, through adversarial training, increases their resistance to structural attacks.