Keywords: Representation Learning, Explainable Deep Learning
Abstract: Contrastive learning has substantially improved the quality of learned embedding representations for tasks such as image classification. However, a key drawback of existing contrastive augmentation methods is that they may modify image content in ways that undesirably alter its semantics, which can hurt performance on downstream tasks. Hence, in this paper, we ask whether image data can be augmented for contrastive learning such that the task-relevant semantic content of an image is preserved. To this end, we propose leveraging saliency-based explanation methods to create content-preserving masked augmentations for contrastive learning. Our novel explanation-driven supervised contrastive learning (ExCon) methodology critically serves the dual goals of encouraging nearby image embeddings to have similar content and explanation, which we verify through t-SNE visualizations of the embeddings. To quantify the impact of ExCon's embedding methodology, we conduct experiments on CIFAR-100 as well as the Tiny ImageNet dataset and demonstrate that ExCon outperforms vanilla supervised contrastive learning both in terms of classification accuracy and in terms of the explanation quality of the model.
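The content-preserving masked augmentation described above can be illustrated with a minimal sketch. This is not the paper's actual ExCon pipeline; the function name, the quantile-based thresholding, and the toy saliency map are all illustrative assumptions about how a saliency map might be turned into a mask:

```python
import numpy as np

def saliency_masked_view(image, saliency, keep_fraction=0.5):
    """Zero out the least-salient pixels, keeping only the top
    `keep_fraction` of pixels by saliency, so that the task-relevant
    content highlighted by the explanation method is preserved.
    (Hypothetical helper; thresholding scheme is an assumption.)"""
    # Threshold at the (1 - keep_fraction) quantile of the saliency map.
    threshold = np.quantile(saliency, 1.0 - keep_fraction)
    mask = (saliency >= threshold).astype(image.dtype)
    # Broadcast the 2-D spatial mask over the channel axis (H, W, C).
    return image * mask[..., None]

# Toy example: a 4x4 RGB image whose saliency is concentrated
# in the top-left corner, so only that region survives the mask.
rng = np.random.default_rng(0)
img = rng.random((4, 4, 3))
sal = np.zeros((4, 4))
sal[:2, :2] = 1.0  # "salient" region
masked = saliency_masked_view(img, sal, keep_fraction=0.25)
```

In a supervised contrastive setup, such a masked view would be treated as an additional positive for the original image, encouraging the encoder to place images with similar content and similar explanations close together in embedding space.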