Keywords: computer vision, deep learning, adversarial attacks, metamorphic testing, adversarial defense, adversarial detection
TL;DR: An adversarial defense pipeline that applies nonlinear image transformations and metamorphic testing to detect adversarial inputs to face recognition systems.
Abstract: Adversarial examples pose a serious threat to the robustness of machine learning models in general and deep learning models in particular. Computer vision tasks such as image classification, facial recognition, and object detection, as well as natural language processing tasks such as sentiment analysis and semantic similarity assessment, have all been shown to be vulnerable to adversarial attacks. In computer vision specifically, carefully crafted perturbations to input images can cause targeted misclassifications to a label of the attacker's choice, without the perturbations being detectable to the naked eye. A particular class of adversarial attacks, called black-box attacks, can fool a model under attack despite having no access to the model's parameters or the datasets used to train it. In the research presented in this paper, we first deploy a range of state-of-the-art adversarial attacks against multiple face recognition pipelines in a black-box setup, generating pair-wise adversarial image sets that deceive the corresponding models under attack. We then propose a novel approach to adversarial detection that uses statistical techniques to learn optimal thresholds of separation between clean and adversarial examples, achieving state-of-the-art detection accuracies of over 90%. Our proposed method has been exhaustively tested across combinations of face recognition models under attack and adversarial attack types, with encouraging results.
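As a minimal illustration of the thresholding idea in the abstract (not the authors' actual pipeline), the sketch below assumes each input has already been reduced to a scalar score, e.g. how much the model's output diverges after a nonlinear transformation is applied to the image, with clean inputs expected to shift little and adversarial inputs to shift more. The function then searches candidate thresholds for the one that best separates the two score distributions; the score definition and toy data are illustrative assumptions.

```python
import numpy as np

def learn_threshold(clean_scores, adv_scores):
    """Pick the cut-off that best separates the two score distributions
    by maximizing balanced accuracy over candidate thresholds taken
    from the observed score values."""
    candidates = np.sort(np.unique(np.concatenate([clean_scores, adv_scores])))
    best_t, best_acc = None, -1.0
    for t in candidates:
        tpr = np.mean(adv_scores >= t)   # adversarial inputs flagged
        tnr = np.mean(clean_scores < t)  # clean inputs passed through
        acc = 0.5 * (tpr + tnr)          # balanced accuracy
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc

# Toy data (illustrative only): clean inputs shift little under the
# transformation, adversarial inputs shift noticeably more.
rng = np.random.default_rng(0)
clean = rng.normal(0.2, 0.05, 500)
adv = rng.normal(0.6, 0.10, 500)
t, acc = learn_threshold(clean, adv)
```

At detection time, any input whose score exceeds the learned threshold `t` would be flagged as adversarial; the paper's method learns such thresholds per model/attack combination.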