Disentangling Improves VAEs' Robustness to Adversarial Attacks

Matthew Willetts; Alexander Camuto; Stephen Roberts; Chris Holmes

25 Sept 2019 (modified: 05 May 2023) | ICLR 2020 Conference Blind Submission | Readers: Everyone

TL;DR: We show that disentangled VAEs are more robust than vanilla VAEs to adversarial attacks that aim to trick them into decoding the adversarial input to a chosen target. We then develop an even more robust hierarchical disentangled VAE, Seatbelt-VAE.

Abstract: This paper is concerned with the robustness of VAEs to adversarial attacks. We highlight that conventional VAEs are brittle under attack but that methods recently introduced for disentanglement, such as β-TCVAE (Chen et al., 2018), improve robustness, as demonstrated through a variety of previously proposed adversarial attacks (Tabacof et al., 2016; Gondim-Ribeiro et al., 2018; Kos et al., 2018). This motivated us to develop Seatbelt-VAE, a new hierarchical disentangled VAE that is designed to be significantly more robust to adversarial attacks than existing approaches, while retaining high-quality reconstructions.
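To make the kind of attack described above concrete, here is a minimal, dependency-free sketch of a latent-space attack that pushes an input's encoding toward that of a chosen target. It is an illustration under strong assumptions, not the paper's experimental setup: the encoder is a hypothetical linear Gaussian map q(z|x) = N(Wx, σ²I), for which the KL divergence between two encodings reduces to a scaled squared distance between their means, and the perturbation is penalised by λ‖d‖². The names `latent_attack`, `latent_kl`, `lam`, and `sigma` are chosen for this example only.

```python
import random


def matvec(W, v):
    """Return W @ v for a list-of-rows matrix W."""
    return [sum(w * vi for w, vi in zip(row, v)) for row in W]


def mat_t_vec(W, r):
    """Return W^T @ r."""
    return [sum(W[i][j] * r[i] for i in range(len(W))) for j in range(len(W[0]))]


def latent_kl(W, a, b, sigma=1.0):
    """KL(q(z|a) || q(z|b)) for the assumed encoder q(z|x) = N(Wx, sigma^2 I)."""
    diff = [p - q for p, q in zip(matvec(W, a), matvec(W, b))]
    return sum(v * v for v in diff) / (2.0 * sigma ** 2)


def latent_attack(W, x, x_target, lam=0.1, sigma=1.0, steps=200):
    """Gradient descent on d for latent_kl(x + d, x_target) + lam * ||d||^2."""
    z_target = matvec(W, x_target)
    d = [0.0] * len(x)
    # The squared Frobenius norm upper-bounds the squared spectral norm of W,
    # so 1/L with this L is a safe step size for the quadratic objective.
    frob_sq = sum(w * w for row in W for w in row)
    lr = 1.0 / (frob_sq / sigma ** 2 + 2.0 * lam)
    for _ in range(steps):
        z_adv = matvec(W, [xi + di for xi, di in zip(x, d)])
        residual = [a - t for a, t in zip(z_adv, z_target)]
        grad_kl = mat_t_vec(W, residual)  # sigma^2 times the KL gradient
        d = [di - lr * (g / sigma ** 2 + 2.0 * lam * di)
             for di, g in zip(d, grad_kl)]
    return d


# Toy data: a random 4-dimensional latent space over 16-dimensional inputs.
rng = random.Random(0)
W = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(4)]
x = [rng.gauss(0.0, 1.0) for _ in range(16)]
x_target = [rng.gauss(0.0, 1.0) for _ in range(16)]

d = latent_attack(W, x, x_target)
kl_before = latent_kl(W, x, x_target)
kl_after = latent_kl(W, [xi + di for xi, di in zip(x, d)], x_target)
```

Comparing `kl_before` with `kl_after` shows how far the perturbation moves the encoding toward the target. In this toy setting, robustness would correspond to needing a larger perturbation to achieve the same reduction.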

Code: https://www.dropbox.com/sh/1x3vctui9oo5max/AACSSHTaxl6AkNkpgevXU1KVa?dl=1

