- Abstract: Neural machine translation (NMT) systems have received massive attention from academia and industry. Although a rich body of work focuses on improving the accuracy of NMT systems, the less explored topic of efficiency is also important because translation applications must respond in real time. In this paper, we observe an inherent property of NMT systems: their efficiency depends on the output length rather than the input length. This property exposes a new attack surface: an adversary can slightly change inputs to make an NMT system perform a significant amount of redundant computation. Such abuse of NMT systems’ computational resources is analogous to a denial-of-service attack. It degrades service quality (e.g., prolonged responses to users’ translation requests) and can even make the translation service unavailable (e.g., by exhausting resources such as the batteries of mobile devices). To further the understanding of such efficiency-oriented threats and raise the community’s concern about the efficiency robustness of NMT systems, we propose a new attack approach, TranSlowDown, to test the efficiency robustness of NMT systems. To demonstrate the effectiveness of TranSlowDown, we conduct a systematic evaluation on three publicly available NMT systems: Google T5, Facebook Fairseq, and the Helsinki-NLP translator. The experimental results show that TranSlowDown increases NMT systems’ response latency by up to 1232% and 1056% on an Intel CPU and an Nvidia GPU, respectively, by inserting only three characters into existing input sentences. Our results also show that the adversarial examples generated by TranSlowDown can consume more than 30 times as much battery power as the original benign examples. These results suggest that further research is required to protect NMT systems against efficiency-oriented threats.
- Supplementary Material: zip
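
The property the abstract relies on, that decoding cost tracks output length rather than input length, can be illustrated with a toy sketch. This is not TranSlowDown or any of the evaluated systems: `greedy_decode`, `toy_model`, the `EOS` id, and the marker token `99` are all hypothetical stand-ins, chosen only to show that an autoregressive loop makes one decoder call per output token and stops only at an end-of-sequence token or a length cap.

```python
from typing import Callable, List, Tuple

EOS = 0  # hypothetical end-of-sequence token id


def greedy_decode(
    next_token: Callable[[List[int], List[int]], int],
    source: List[int],
    max_len: int = 200,
) -> Tuple[List[int], int]:
    """Toy autoregressive loop: one decoder call per generated token."""
    output: List[int] = []
    decoder_calls = 0
    while len(output) < max_len:
        token = next_token(source, output)
        decoder_calls += 1
        if token == EOS:
            break  # decoding ends only when EOS is produced...
        output.append(token)
    return output, decoder_calls  # ...or when max_len is reached


def toy_model(source: List[int], output: List[int]) -> int:
    """Hypothetical stand-in for an NMT decoder (an assumption, not a real model).

    It copies the source and then emits EOS, unless the source contains
    the marker token 99, in which case it never emits EOS.
    """
    if 99 in source:
        return 7
    return source[len(output)] if len(output) < len(source) else EOS


benign = [5, 6, 7]
perturbed = [5, 6, 7, 99]  # a single extra input token

_, benign_calls = greedy_decode(toy_model, benign)
_, perturbed_calls = greedy_decode(toy_model, perturbed)
print(benign_calls, perturbed_calls)
```

In this toy setting the inputs differ by one token, yet the perturbed input keeps the loop running until the `max_len` cap. Real NMT decoders are far more complex, but the cost structure of the loop is the same.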