Keywords: decision-based attacks, sparse attacks, evolution algorithms, vision transformer, convolutional neural network
Abstract: Despite our best efforts, deep learning models remain highly vulnerable to even tiny adversarial perturbations applied to the inputs. The ability to extract information from solely the output of a machine learning model to craft adversarial perturbations to black-box models is a practical threat against real-world systems, such as Machine Learning as a Service (MLaaS), particularly $sparse~attacks$. The realization of sparse attacks in black-box settings demonstrates that machine learning models are more vulnerable than we believe. Because, these attacks aim to $minimize~the~number~of~perturbed~pixels$—measured by $l_0$ norm—required to mislead a model by $solely$ observing the decision ($the~predicted~label$) returned to a model query; the so-called $decision-based~setting$. But, such an attack leads to an NP-hard optimization problem. We develop an evolution-based algorithm—$SparseEvo$—for the problem and evaluate it against both convolutional deep neural networks and $vision~transformers$. Notably, vision transformers are yet to be investigated under a decision-based attack setting. SparseEvo requires significantly fewer queries than the state-of-the-art sparse attack $Pointwise$ for both untargeted and targeted attacks. The attack algorithm, although conceptually simple, is competitive with only a limited query budget against the state-of-the-art gradient-based $white-box$ attacks in standard computer vision tasks such as $ImageNet$. Importantly, the query efficient SparseEvo, along with decision-based attacks, in general, raises new questions regarding the safety of deployed systems and poses new directions to study and understand the robustness of machine learning models.