Perception Adversarial Attacks on Neural Machine Translation Systems

Anonymous

17 Apr 2022 (modified: 05 May 2023) · ACL ARR 2022 April Blind Submission · Readers: Everyone
Abstract: With the advent of deep learning methods, Neural Machine Translation (NMT) systems have become increasingly powerful. However, deep-learning-based systems are susceptible to adversarial attacks, where imperceptible changes to the input can cause large, undesirable changes at the output of the system. To date, there has been little work investigating adversarial attacks on sequence-to-sequence systems, such as NMT models. Previous work in NMT has examined attacks with the aim of introducing target phrases in the output sequence. In this work, adversarial attacks for sequence-to-sequence tasks are explored from an output perception perspective. Thus, the aim of an attack is to change the perception of the output sequence. For example, an adversary may want to make an output sequence have an exaggerated positive sentiment. In practice it is not possible to run extensive human perception experiments, so a proxy deep-learning classifier applied to the NMT output is used to measure perception changes. Experiments demonstrate that the sentiment perception of NMT systems' output sequences can be changed significantly, with only small, imperceptible changes to the input sequences.
Paper Type: short
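To make the setup in the abstract concrete, below is a minimal, illustrative sketch of a perception-style attack: a black-box word-substitution search that perturbs the source sentence so that a proxy sentiment classifier applied to the NMT output becomes more positive, while the input edit stays small. This is not the paper's exact method; the model names, the synonym list, and the greedy search are assumptions made only for illustration.

```python
# Illustrative sketch of a perception attack on an NMT system:
# perturb the source minimally so that a proxy sentiment classifier
# applied to the translation output shifts toward "positive".
# Model names and substitution candidates are assumptions, not the paper's setup.
from transformers import pipeline

# Proxy components: an English->German NMT system and a German sentiment classifier.
nmt = pipeline("translation_en_to_de", model="Helsinki-NLP/opus-mt-en-de")
sentiment = pipeline("sentiment-analysis", model="oliverguhr/german-sentiment-bert")


def positivity(src: str) -> float:
    """Translate the source and return the proxy classifier's positive score."""
    translation = nmt(src)[0]["translation_text"]
    pred = sentiment(translation)[0]
    return pred["score"] if pred["label"].lower() == "positive" else 1.0 - pred["score"]


def greedy_attack(src: str, substitutions: dict[str, list[str]]) -> str:
    """Greedily swap single source words to maximise the positivity of the
    translation, keeping the perturbation to a few word substitutions."""
    words = src.split()
    best = positivity(src)
    for i, w in enumerate(words):
        for cand in substitutions.get(w.lower(), []):
            trial = words.copy()
            trial[i] = cand
            score = positivity(" ".join(trial))
            if score > best:
                best, words = score, trial
    return " ".join(words)


if __name__ == "__main__":
    sentence = "The hotel was fine but the service was slow."
    # Hypothetical near-synonym candidates an adversary might consider.
    subs = {"fine": ["pleasant", "decent"], "slow": ["unhurried", "leisurely"]}
    print(greedy_attack(sentence, subs))
```

In this sketch the proxy classifier stands in for human perception, as described in the abstract: the attack succeeds if the translation's measured sentiment moves while the source edit remains small and semantically innocuous.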