Disrupting Model Training with Adversarial Shortcuts

Ivan Evtimov; Ian Connick Covert; Aditya Kusupati; Tadayoshi Kohno

Disrupting Model Training with Adversarial Shortcuts

Ivan Evtimov, Ian Connick Covert, Aditya Kusupati, Tadayoshi Kohno

Published: 21 Jun 2021, Last Modified: 04 May 2025ICML 2021 Workshop AML PosterReaders: Everyone

Keywords: adversarial shortcuts, disrupting training

TL;DR: Adversarial shortcuts encourage models to rely on non-robust signals rather than semantic features and prevent training of useful models.

Abstract: When data is publicly released for human consumption, it is unclear how to prevent its unauthorized usage for machine learning purposes. Successful model training may be preventable with carefully designed dataset modifications, and we present a proof-of-concept approach for the image classification setting. We propose methods based on the notion of adversarial shortcuts, which encourage models to rely on non-robust signals rather than semantic features, and our experiments demonstrate that these measures successfully prevent deep learning models from achieving high accuracy on real, unmodified data examples

Community Implementations: [![CatalyzeX](/images/catalyzex_icon.svg) 1 code implementation](https://www.catalyzex.com/paper/disrupting-model-training-with-adversarial/code)

2 Replies

Loading