Corrupting Unbounded Unlearnable Datasets with Pixel-based Image Transformations

23 Sept 2023 (modified: 11 Feb 2024). Submitted to ICLR 2024.
Primary Area: representation learning for computer vision, audio, language, and other modalities
Code Of Ethics: I acknowledge that I and all co-authors of this work have read and commit to adhering to the ICLR Code of Ethics.
Keywords: Unlearnable datasets, deep neural networks, image transformations
Submission Guidelines: I certify that this submission complies with the submission instructions as described on https://iclr.cc/Conferences/2024/AuthorGuide.
Abstract: Unlearnable datasets (UDs) introduce elaborate, imperceptible perturbations into clean training sets and thereby cause a drastic drop in the generalization performance of models trained on them. Many existing defenses, e.g., JPEG compression and adversarial training, effectively counter UDs that rely on norm constraints (i.e., bounded UDs). However, the recent emergence of unbounded UDs renders these defenses completely ineffective and poses a greater challenge to defenders. To address this, we express an unbounded unlearnable sample, in a simplified scenario, as the product of a matrix and a clean sample. We also observe in existing unbounded UDs that the consistency of intra-class and inter-class noise significantly affects the unlearnable effect. This motivates us to formalize the intra-class matrix inconsistency as $\Theta_{imi}$ and the inter-class matrix consistency as $\Theta_{imc}$, and to conjecture that increasing both metrics improves test accuracy. After validation experiments that support this hypothesis, we design a random matrix that boosts both $\Theta_{imi}$ and $\Theta_{imc}$ and achieves a notable defense effect. Building on these findings, we propose a new image COrruption that employs a random multiplicative transformation via an INterpolation operation (COIN) to defend against existing unbounded UDs. Our approach applies global random pixel interpolation, which suppresses the impact of the multiplicative noise in unbounded UDs. Extensive experiments demonstrate that our defense outperforms state-of-the-art defenses, improving average test accuracy on the CIFAR-10 dataset by 23.55\%-48.11\%.
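The idea of a random interpolation-based corruption can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `random_interpolation_corrupt`, the parameter `alpha`, the uniform offset distribution, the border clamping, and the single-channel nested-list image format are all assumptions made for illustration. Each output pixel is a bilinear interpolation of the input at a randomly shifted location, so the whole operation is linear in the image and can be viewed as multiplying the flattened sample by a random matrix.

```python
import math
import random


def random_interpolation_corrupt(image, alpha=1.0, rng=None):
    """Resample a grayscale image (list of rows) at randomly shifted positions.

    Hypothetical sketch: `alpha` bounds the per-pixel random offset (in pixels);
    the paper's actual parameterization may differ.
    """
    rng = rng or random.Random()
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Random sampling location, clamped to the image borders (assumption).
            sy = min(max(y + rng.uniform(-alpha, alpha), 0.0), h - 1.0)
            sx = min(max(x + rng.uniform(-alpha, alpha), 0.0), w - 1.0)
            # Integer neighbors surrounding the sampling location.
            y0, x0 = int(math.floor(sy)), int(math.floor(sx))
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            # Fractional parts act as bilinear interpolation weights.
            wy, wx = sy - y0, sx - x0
            top = (1 - wx) * image[y0][x0] + wx * image[y0][x1]
            bottom = (1 - wx) * image[y1][x0] + wx * image[y1][x1]
            out[y][x] = (1 - wy) * top + wy * bottom
    return out
```

In this sketch, `alpha=0` leaves the image unchanged, and larger values of `alpha` mix each pixel with more distant neighbors. Because a fresh set of offsets is drawn on every call, images of the same class receive different transformations.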
Anonymous Url: I certify that there is no URL (e.g., github page) that could be used to find authors' identity.
No Acknowledgement Section: I certify that there is no acknowledgement section in this submission for double blind review.
Submission Number: 7861