Abstract: We propose a novel framework for avoiding the misclassification of data by using a framework of learning with rejection and adversarial examples. Recent developments in machine learning have opened new opportunities for industrial innovations such as self-driving cars. However, many machine learning models are vulnerable to adversarial attacks and industrial practitioners are concerned about accidents arising from misclassification. To avoid critical misclassifications, we define a sample that is likely to be mislabeled as a suspicious sample. Our main idea is to apply a framework of learning with rejection and adversarial examples to assist in the decision making for such suspicious samples. We propose two frameworks, learning with rejection under adversarial attacks and learning with protection. Learning with rejection under adversarial attacks is a naive extension of the learning with rejection framework for handling adversarial examples. Learning with protection is a practical application of learning with rejection under adversarial attacks. This algorithm transforms the original multi-class classification problem into a binary classification for a specific class, and we reject suspicious samples to protect a specific label. We demonstrate the effectiveness of the proposed method in experiments.
Keywords: Learning with Rejection, Adversarial Examples
Original Pdf: pdf
7 Replies
Loading