A Textual Adversarial Attack Scheme for Domain-Specific Models

Jialiang Dong, Shen Wang, Longfei Wu, Huoyuan Dong, Zhitao Guan

Published: 2022, Last Modified: 18 Nov 2023, ML4CS (2) 2022, Readers: Everyone

Abstract: Most textual adversarial attack methods generate adversarial examples by searching for solutions in a perturbation space constructed from a universal corpus. These methods perform well against models trained on a universal corpus, but their attack capability drops sharply against domain-specific models. In this paper, we inject domain-specific knowledge into the perturbation space and combine the new domain-specific space with the universal space to enlarge the candidate space for attacks. Specifically, for a domain-specific victim model, the corresponding corpus is used to construct a domain-specific word embedding space, which serves as the augmented perturbation space. In addition, we use beam search to widen the search range and further improve attack capability. Experimental results across multiple victim models, datasets, and baselines show that our attack method achieves significant improvements when attacking domain-specific models.
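
The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy two-dimensional embeddings, the helper names (`neighbors`, `candidates`, `beam_attack`), the cosine-similarity neighbor rule, and the `toy_victim` scoring function are all assumptions made for illustration. The sketch merges substitute words from a universal space and a domain-specific space, then runs a beam search over word substitutions to lower the victim model's confidence.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def neighbors(word, space, k):
    """Top-k nearest words to `word` in one embedding space (empty if unknown)."""
    if word not in space:
        return []
    ranked = sorted((w for w in space if w != word),
                    key=lambda w: cosine(space[word], space[w]), reverse=True)
    return ranked[:k]

def candidates(word, universal, domain, k=2):
    """Union of substitutes from the universal and the domain-specific space."""
    merged = neighbors(word, universal, k) + neighbors(word, domain, k)
    return list(dict.fromkeys(merged))  # de-duplicate while keeping order

def beam_attack(tokens, victim_score, universal, domain, beam_width=2, k=2):
    """Beam search over word substitutions, one position at a time.

    victim_score(tokens) is the victim's confidence in the original label;
    lower is better for the attacker. Returns the lowest-scoring token list.
    """
    beams = [list(tokens)]
    for i in range(len(tokens)):
        expanded = list(beams)  # keeping a beam unchanged is always allowed
        for beam in beams:
            for sub in candidates(tokens[i], universal, domain, k):
                expanded.append(beam[:i] + [sub] + beam[i + 1:])
        expanded.sort(key=victim_score)  # stable sort, lowest confidence first
        beams = expanded[:beam_width]
    return beams[0]

# Hypothetical toy spaces: "dose" exists only in the domain-specific space.
universal = {"take": (1.0, 0.0), "use": (0.9, 0.2), "good": (0.0, 1.0)}
domain = {"dose": (1.0, 0.0), "dosage": (0.9, 0.1), "tablet": (0.2, 0.9)}

def toy_victim(tokens):
    """Hypothetical victim: confident only while the word 'dose' is present."""
    return 0.9 if "dose" in tokens else 0.1

adversarial = beam_attack(["take", "dose"], toy_victim, universal, domain)
print(adversarial)  # ['take', 'dosage']
```

In this toy setting the universal space offers no substitute for "dose", so only the domain-specific space supplies a candidate that changes the victim's score, which mirrors the motivation for enlarging the perturbation space.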
