Adversarial for Good? How the Adversarial ML Community's Values Impede Socially Beneficial Uses of Attacks
Keywords: Adversarial Machine Learning, Ethical Impact
TL;DR: After analyzing the broader impact statements of adversarial ML papers published at NeurIPS 2020, we find that hidden values of the adversarial ML field may keep researchers from using adversarial attacks for social good.
Abstract: Attacks from adversarial machine learning (ML) have the potential to be used "for good": they can be used to run counter to the existing power structures within ML, creating breathing space for those who would otherwise be the targets of surveillance and control. But most research on adversarial ML has not engaged in developing tools for resistance against ML systems. Why? In this paper, we review the broader impact statements that adversarial ML researchers wrote as part of their NeurIPS 2020 papers and assess the assumptions that authors have about the goals of their work. We also collect information about how authors view their work's impact more generally. We find that most adversarial ML researchers at NeurIPS hold two fundamental assumptions that will make it difficult for them to consider socially beneficial uses of attacks: (1) it is desirable to make systems robust, independent of context, and (2) attackers of systems are normatively bad and defenders of systems are normatively good. That is, despite their expressed and supposed neutrality, most adversarial ML researchers believe that the goal of their work is to secure systems, making it difficult to conceptualize and build tools for disrupting the status quo.