Keywords: data poisoning, neural network quantization
TL;DR: We study the robustness of neural network quantization against data poisoning attacks.
Abstract: Data poisoning attacks refer to the threat where the predictions of a machine learning model are maliciously altered when part of the training data is contaminated. Such attacks have been well studied in the context of full-precision training, yet remain under-explored for neural network quantization, which is becoming increasingly popular for reducing the memory cost and inference time of large models. In this work, we deploy poisoned data generated by existing state-of-the-art attacks against quantized models and reveal its poor transferability, which often renders the attacks largely ineffective. Our experiments thus uncover a surprising side benefit of neural network quantization: it not only reduces the memory footprint but also strengthens a model's robustness against data poisoning attacks.
Conversely, we also propose new quantization-aware attacks to explore the practicality of poisoning quantized models. Our experiments confirm that the new attacks improve attack effectiveness (test accuracy drop) across a range of quantization and poisoning setups, by up to 90% in the best case.
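To make the transferability experiment concrete, below is a minimal, hypothetical sketch (not the authors' code) of how one might measure whether a poisoning-induced accuracy drop in a full-precision model survives post-training quantization. The toy MLP, the random stand-in test set, and the use of PyTorch's dynamic int8 quantization are all illustrative assumptions; in the paper's setting the model would have been trained on poisoned data and evaluated on a real test set.

```python
# Hypothetical sketch: compare test accuracy of a (poisoned) full-precision
# model against an int8-quantized copy of the same model.
import torch
import torch.nn as nn


def accuracy(model, x, y):
    """Fraction of correctly classified examples."""
    model.eval()
    with torch.no_grad():
        preds = model(x).argmax(dim=1)
    return (preds == y).float().mean().item()


# Toy stand-ins: a small classifier and a random "test set".
# In practice, `model` would be trained on poisoned data.
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(32 * 32 * 3, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)
x_test = torch.randn(128, 3, 32, 32)
y_test = torch.randint(0, 10, (128,))

# Full-precision accuracy (would reflect any poisoning-induced drop).
fp_acc = accuracy(model, x_test, y_test)

# Post-training dynamic quantization of the Linear layers to int8.
q_model = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
q_acc = accuracy(q_model, x_test, y_test)

print(f"full precision: {fp_acc:.3f}  int8: {q_acc:.3f}")
```

If the poisoning effect transfers poorly, the quantized model's accuracy would recover relative to the full-precision one, which is the phenomenon the abstract reports.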
Submission Number: 48