04 Oct 2024 · CC BY 4.0
Adversarial contrastive learning aims to learn a representation space robust to adversarial inputs using only unlabeled data. Existing methods typically generate adversarial perturbations by maximizing the contrastive loss during adversarial training. However, we find that the effectiveness of this approach depends on the composition of positive and negative examples in a minibatch, which is not explicitly controllable. To address this limitation, we propose a novel approach to adversarial contrastive learning in which adversarial perturbations are generated based on the clustering structure of the representation space learned through contrastive learning. Our method is motivated by the observation that contrastive learning produces a well-separated representation space, where similar data points cluster together while dissimilar ones are positioned farther apart. We hypothesize that perturbations directed toward a neighboring cluster (specifically, the second nearest) are likely to cross the decision boundary of a downstream classifier built upon contrastive learning, effectively acting as adversarial examples. A key challenge in our approach is determining an appropriate number of clusters; the number of classes in the downstream task would serve this purpose, but it is typically unknown during adversarial contrastive learning. We therefore employ the silhouette score to identify the optimal number of clusters, ensuring high-quality clustering in the representation space. Compared to existing approaches, our method achieves up to 2.25% and 5.05% improvements in robust accuracy against PGD and AutoAttack, respectively, while also showing slight improvements in standard accuracy in most cases.
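The two core ingredients described above (selecting the cluster count by silhouette score, and directing perturbations toward the second-nearest cluster) can be sketched as follows. This is an illustrative sketch under assumed details, not the paper's implementation: the function names `select_num_clusters` and `perturbation_direction`, the use of k-means, and the candidate range of cluster counts are all assumptions for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def select_num_clusters(features, k_range=range(2, 10), seed=0):
    """Pick the cluster count that maximizes the silhouette score.

    features: (n_samples, dim) array of learned representations.
    """
    best_k, best_score = None, -1.0
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(features)
        score = silhouette_score(features, labels)
        if score > best_score:
            best_k, best_score = k, score
    return best_k


def perturbation_direction(feature, centroids):
    """Unit vector pointing from a representation toward the centroid of
    its second-nearest cluster, used as the adversarial perturbation direction."""
    dists = np.linalg.norm(centroids - feature, axis=1)
    second_nearest = np.argsort(dists)[1]  # index 0 is the nearest cluster
    direction = centroids[second_nearest] - feature
    return direction / (np.linalg.norm(direction) + 1e-12)
```

In an adversarial training loop, the input would then be perturbed so that its representation moves along this direction (e.g., within an epsilon-ball), replacing the usual contrastive-loss-maximizing perturbation.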