Adversarially Robust Learning with Uncertain Perturbation Sets

Published: 21 Sept 2023, Last Modified: 06 Nov 2023, NeurIPS 2023 poster
Keywords: adversarially robust learning
TL;DR: We introduce a setting for adversarially robust learning that interpolates between a known perturbation type and entirely unknown perturbation types, providing a PAC-type analysis with respect to a class of perturbation types.
Abstract: In many real-world settings, the exact perturbation sets to be used by an adversary are not plausibly available to a learner. While prior literature has studied both scenarios with completely known and completely unknown perturbation sets, we propose an in-between setting of learning with respect to a class of perturbation sets. We show that in this setting we can improve on previous results for completely unknown perturbation sets, while still addressing the concern that perfect knowledge of these sets is unavailable in practice. In particular, we give the first positive results for the learnability of infinite Littlestone classes when the learner has access to a perfect-attack oracle. We also consider a setting of learning with abstention, where predictions count as robustness violations only when the wrong prediction is made within the perturbation set. We show that there are classes for which perturbation-set-unaware learning without query access is possible, but abstention is required.
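To make the setting concrete, here is a minimal toy sketch of the interfaces the abstract describes: a class of candidate perturbation sets (here, a few L-infinity radii in one dimension) and a perfect-attack oracle that, for a predictor and a labeled point, reports whether some perturbation inside the *true* (unknown) set flips the prediction. All names, the 1-D setup, and the grid-based attack search are illustrative assumptions, not the paper's construction.

```python
# Toy sketch (assumed, not from the paper): the learner knows only a class
# of candidate perturbation radii; the adversary's true radius is hidden.
# A perfect-attack oracle answers: does some z in the true perturbation
# set of x make h(z) != y?

def make_oracle(true_radius):
    """Perfect-attack oracle for 1-D intervals [x - r, x + r] of hidden radius r."""
    def oracle(h, x, y):
        # Search a small grid inside the true perturbation set for a
        # successful attack (sufficient for this piecewise-constant toy h).
        steps = [true_radius * i / 10 for i in range(-10, 11)]
        return any(h(x + d) != y for d in steps)
    return oracle

def threshold(t):
    """Simple threshold predictor: label 1 iff x >= t."""
    return lambda x: 1 if x >= t else 0

# Class of candidate radii the learner entertains (hypothetical).
candidate_radii = [0.1, 0.5, 1.0]
oracle = make_oracle(true_radius=0.5)  # true radius is one of the candidates

h = threshold(0.0)
print(oracle(h, 2.0, 1))  # False: every point in [1.5, 2.5] keeps label 1
print(oracle(h, 0.2, 1))  # True: x - 0.5 = -0.3 crosses the threshold
```

The oracle abstracts away how attacks are found; the paper's point is that such query access, combined with knowing only a class containing the true perturbation set, suffices for learnability results that fail when the set is entirely unknown.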
Supplementary Material: pdf
Submission Number: 14138