Keywords: contrastive learning, multi-task learning
TL;DR: We propose conditional contrastive networks (CCNs) as a way to learn multiple disjoint similarity notions by projecting each similarity notion into a different subspace.
Abstract: A vast amount of structured information associated with unstructured data, such as images or text, is stored online. This structured information implies different similarity relationships among the unstructured data. Recently, contrastively learned embeddings trained on web-scraped unstructured data have been shown to achieve state-of-the-art performance across computer vision tasks. However, contrastive learning methods are currently able to leverage only a single notion of similarity. In this paper, we propose conditional contrastive networks (CCNs) as a way of using multiple notions of similarity in structured data. Our novel conditional contrastive loss is able to learn multiple disjoint similarity notions by projecting each notion into a different subspace. We show empirically that our CCNs perform better than single-label trained cross-entropy networks, single-label trained supervised-contrastive networks, multi-task trained cross-entropy networks, and previously proposed conditional similarity networks, both on the attributes on which they were trained and on unseen attributes.
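The core idea described above, projecting embeddings into a condition-specific subspace and applying a supervised contrastive loss there, can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation: the function name `conditional_supcon_loss`, the per-condition projection matrices, and the temperature value are all assumptions for illustration.

```python
import numpy as np

def conditional_supcon_loss(embeddings, labels, projections, condition,
                            temperature=0.1):
    """Hypothetical sketch of a conditional contrastive loss.

    Each similarity notion (`condition`) has its own projection matrix;
    a supervised contrastive loss is computed in that subspace, so that
    disjoint notions of similarity live in different subspaces.
    """
    # Project into the condition-specific subspace and L2-normalize.
    z = embeddings @ projections[condition]
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature  # pairwise cosine similarities, scaled

    n = len(labels)
    loss, anchors = 0.0, 0
    for i in range(n):
        others = [j for j in range(n) if j != i]
        positives = [j for j in others if labels[j] == labels[i]]
        if not positives:
            continue  # skip anchors with no positive pair
        denom = np.sum(np.exp(sim[i, others]))
        # Average -log p(positive | anchor) over the anchor's positives.
        loss += -np.mean([np.log(np.exp(sim[i, j]) / denom)
                          for j in positives])
        anchors += 1
    return loss / max(anchors, 1)
```

In this sketch, switching the `condition` argument (e.g. "color" vs. "category") selects a different projection, so the same embedding network can be trained against several disjoint label sets without the losses interfering in a shared metric space.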